You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hc.apache.org by Mark Mielke <ma...@gmail.com> on 2018/08/07 03:00:09 UTC

Regression: BasicCookieStore no longer Serializable after HttpClient 4.5.3

Hi all:

I have code that relies on BasicCookieStore being Serializable. After
updating from HttpClient 4.5.3 to 4.5.6 I found that this code was broken.

The ObjectInputStream.readObject() and ObjectOutputStream.writeObject()
continue to work, but any code that accesses the cookies fails with an
error like:

Exception in thread "main" java.lang.NullPointerException
at
org.apache.http.impl.client.BasicCookieStore.getCookies(BasicCookieStore.java:116)

Looking at this change, which was introduced into HttpClient 4.5.4:

https://github.com/apache/httpcomponents-client/pull/81/commits/c6413a01ddcfcde47c0008dca89d076cb2fe5176

I believe the issue is that a new "lock" variable was introduced, but the
"lock" variable is not Serializable in its current form. When the object is
reconstructed as a result of ObjectInputStream.readObject(), the "lock"
variable is being left as null. Then, code like this:

    public List<Cookie> getCookies() { lock.readLock().lock(); try {
//create defensive copy so it won't be concurrently modified return new
ArrayList<>(cookies); } finally { lock.readLock().unlock(); } }

Is failing, because "lock" is null, and "lock.readLock()" therefore returns
a NullPointerException.

Another problem here, is that the serialVersionUID has not been updated, so
this allows the new object layout and the old object layout to be treated
as interchangeable by the ObjectInputStream, preventing it from detecting
the conflict in layout.

I am considering working around this, by serializing the List<Cookie>
instead of the BasicCookieStore, but I think this could affect others and
it should be fixed.

I think the options here include:

   1. Revert this change. It isn't clear to me that this change provides
   significant value. The "synchronized" vs "ReadWriteLock" only provides
   value under significant load. Is the extra complexity really worth it?
   2. Fix the code, by declaring "lock" as transient, and providing
   readObject and writeObject() methods to deal with this non-Serializable
   field.
   3. Fix the code by dynamically allocating a "lock" if it is found to be
   null.

Personally, I would like to see the case for using a ReadWriteLock here,
and if the case is not good - KISS is a good principle, and a revert would
be the right starting point.

What do you think?

Once you provide feedback, I'm willing to implement the change.

Thanks!

-- 
Mark Mielke <ma...@gmail.com>

Re: Regression: BasicCookieStore no longer Serializable after HttpClient 4.5.3

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Wed, 2018-08-08 at 07:02 -0400, Mark Mielke wrote:
> On Tue, Aug 7, 2018 at 3:17 AM Oleg Kalnichevski <ol...@apache.org>
> wrote:
> 
> > On Mon, 2018-08-06 at 23:00 -0400, Mark Mielke wrote:
> > Looking at this change, which was introduced into HttpClient 4.5.4:
> > 
> > https://github.com/apache/httpcomponents-client/pull/81/commits/c64
> > 13
> > > a01ddcfcde47c0008dca89d076cb2fe5176
> > > I believe the issue is that a new "lock" variable was introduced,
> > > but the
> > > "lock" variable is not Serializable in its current form. When the
> > > object
> > 
> > is
> 
> ...
> 
> > > Another problem here, is that the serialVersionUID has not been
> > 
> > updated, so this allows the new object layout and the old object
> > layout
> > to be
> > > treated as interchangeable by the ObjectInputStream, preventing
> > > it from
> > > detecting the conflict in layout.
> 
> ...
> 
> > > I think the options here include:
> > > 
> > >    1. Revert this change. It isn't clear to me that this change
> > > provides
> > >    significant value. The "synchronized" vs "ReadWriteLock" only
> > > provides
> > >    value under significant load. Is the extra complexity really
> > > worth it?
> > >    2. Fix the code, by declaring "lock" as transient, and
> > > providing
> > >    readObject and writeObject() methods to deal with this
> > 
> > non- Serializable
> > >    field.
> > 
> > I cannot really say how justified it is but other people wanted it
> > and
> > were willing to provide a patch. I suggest that we keep things as
> > is
> > but fix the serialization problem.
> > 
> 
> 
> Thanks, Oleg.
> 
> Upon closer inspection, I found that ReentrantReadWriteLock is in
> fact
> Serializable, and that HttpClient has a test case for ensuring that
> BasicCookieStore can be serialized, restored, and accessed. This
> means that
> on its own, without consideration of other versions, it would appear
> to
> work, and this is why it got past testing.
> 
> The problem I am encountering, is that I have data that was
> previously
> serialized using 4.5.3, that appears to read without errors using
> 4.5.6,
> but as soon as I try to use the restored objects, such as
> cookieStore.getCookies(), I get unexpected NullPointerException. The
> problem in this case, is that the new "lock" field, although
> Serializable
> itself, did not exist in 4.5.3, so that field was not serialized in
> 4.5.3,
> and not restored in 4.5.6. ObjectInputStream.defaultReadObject() does
> not
> call the constructor, but instead follows its own logic to restore
> the
> object, which is to match serialized fields with serialized values.
> Since
> the new field didn't exist prior, it is initialized to the default
> value of
> null upon restore, and subsequent use of "lock" results in
> NullPointerException.
> 
> I think there are two perspectives on what is wrong and how it should
> be
> fixed:
> 
> 1) Intent to introduce a new persistent field, and break
> compatibility with
> existing data stores. In this case, the serialVersionUID should have
> been
> updated, which would have communicated that the data previously
> stored is
> not compatible with new data. I think this could be acceptable if
> this was
> an important persistent field, and if it was introduced in a major
> update
> like HttpClient 4.x to HttpClient 5.x. I do not think this is
> acceptable
> for a patch update like HttpClient 4.5.3 to 4.5.4. In any case - if
> compatibility was broken, a new serialVersionUID should have been
> generated, so the caller would at least be able to know that
> something was
> wrong, and the persistent state was no longer valid. In my case, my
> code
> would have issued a WARNING: to the user, and proceeded with a new
> empty
> cookie store.
> 
> 2) Intent to improve performance, but not break compatibility or
> introduce
> new important persistent state. In this case, there are two real
> options:
> - 2a) Do not change the persistent fields. The new "lock" should be
> "transient" as it is not an essential part of the persistent storage.
> Implement readObject() to properly initiate "lock" during object
> restoration.
> - 2b) Change the persistent fields, but implement a readObject() that
> overwrites or fills in the null "lock" field if restored from a prior
> version.
> - In bove of these cases, I think "lock" cannot be "final", as we
> need to
> be able to set it to a new ReentrantWriteLock() after object
> construction,
> but before object use. Without magic, "final" would have to be
> dropped from
> the qualifier to allow this to occur and pass Java source
> compilation.
> 
> Note that 4.5.4, 2.5.5, 4.5.6, and the 4.6.x and 5.x branches are all
> in a
> problematic state right now, as they are all publishing the new
> BasicCookieStore under the old serialVersionUID.
> 
> I also noted that ReentrantWriteLock may have had serialization
> problems in
> the past, and I think it makes HttpClient unnecessarily vulnerable,
> and the
> serialized data unnecessarily fat, to serialize this data structure
> when it
> can just be re-instantiated at restore time.
> 
> For this reason, I'm going to suggest we implement 2a), to try and
> allow
> the data to automatically "heal" by upgrading to a fixed version.
> 
> I have posted the code here:
> 
> https://github.com/apache/httpcomponents-client/pull/108
> 
> Please review and if this change is satisfactory, please merge to
> 4.5.x and
> all future branches.
> 
> Thanks!
> 

Fully agree that serialization of ReentrantWriteLock is completely
unnecessary. 

I like my stuff final but in this case making the ReentrantWriteLock
attribute non-final is justified. 

Your PR got merged into 4.5.x, 4,6.x and master.

Many thanks for contributing it.

Cheers

Oleg


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


Re: Regression: BasicCookieStore no longer Serializable after HttpClient 4.5.3

Posted by Mark Mielke <ma...@gmail.com>.
On Tue, Aug 7, 2018 at 3:17 AM Oleg Kalnichevski <ol...@apache.org> wrote:

> On Mon, 2018-08-06 at 23:00 -0400, Mark Mielke wrote:

> Looking at this change, which was introduced into HttpClient 4.5.4:
>
> https://github.com/apache/httpcomponents-client/pull/81/commits/c6413
> > a01ddcfcde47c0008dca89d076cb2fe5176
> > I believe the issue is that a new "lock" variable was introduced, but the
> > "lock" variable is not Serializable in its current form. When the object
> is

...

> > Another problem here, is that the serialVersionUID has not been
>
> updated, so this allows the new object layout and the old object layout
> to be
> > treated as interchangeable by the ObjectInputStream, preventing it from
> > detecting the conflict in layout.
>
...

> > I think the options here include:
> >
> >    1. Revert this change. It isn't clear to me that this change provides
> >    significant value. The "synchronized" vs "ReadWriteLock" only provides
> >    value under significant load. Is the extra complexity really worth it?
> >    2. Fix the code, by declaring "lock" as transient, and providing
> >    readObject and writeObject() methods to deal with this
> non- Serializable
> >    field.
>
> I cannot really say how justified it is but other people wanted it and
> were willing to provide a patch. I suggest that we keep things as is
> but fix the serialization problem.
>


Thanks, Oleg.

Upon closer inspection, I found that ReentrantReadWriteLock is in fact
Serializable, and that HttpClient has a test case for ensuring that
BasicCookieStore can be serialized, restored, and accessed. This means that
on its own, without consideration of other versions, it would appear to
work, and this is why it got past testing.

The problem I am encountering, is that I have data that was previously
serialized using 4.5.3, that appears to read without errors using 4.5.6,
but as soon as I try to use the restored objects, such as
cookieStore.getCookies(), I get unexpected NullPointerException. The
problem in this case, is that the new "lock" field, although Serializable
itself, did not exist in 4.5.3, so that field was not serialized in 4.5.3,
and not restored in 4.5.6. ObjectInputStream.defaultReadObject() does not
call the constructor, but instead follows its own logic to restore the
object, which is to match serialized fields with serialized values. Since
the new field didn't exist prior, it is initialized to the default value of
null upon restore, and subsequent use of "lock" results in
NullPointerException.

I think there are two perspectives on what is wrong and how it should be
fixed:

1) Intent to introduce a new persistent field, and break compatibility with
existing data stores. In this case, the serialVersionUID should have been
updated, which would have communicated that the data previously stored is
not compatible with new data. I think this could be acceptable if this was
an important persistent field, and if it was introduced in a major update
like HttpClient 4.x to HttpClient 5.x. I do not think this is acceptable
for a patch update like HttpClient 4.5.3 to 4.5.4. In any case - if
compatibility was broken, a new serialVersionUID should have been
generated, so the caller would at least be able to know that something was
wrong, and the persistent state was no longer valid. In my case, my code
would have issued a WARNING: to the user, and proceeded with a new empty
cookie store.

2) Intent to improve performance, but not break compatibility or introduce
new important persistent state. In this case, there are two real options:
- 2a) Do not change the persistent fields. The new "lock" should be
"transient" as it is not an essential part of the persistent storage.
Implement readObject() to properly initiate "lock" during object
restoration.
- 2b) Change the persistent fields, but implement a readObject() that
overwrites or fills in the null "lock" field if restored from a prior
version.
- In bove of these cases, I think "lock" cannot be "final", as we need to
be able to set it to a new ReentrantWriteLock() after object construction,
but before object use. Without magic, "final" would have to be dropped from
the qualifier to allow this to occur and pass Java source compilation.

Note that 4.5.4, 2.5.5, 4.5.6, and the 4.6.x and 5.x branches are all in a
problematic state right now, as they are all publishing the new
BasicCookieStore under the old serialVersionUID.

I also noted that ReentrantWriteLock may have had serialization problems in
the past, and I think it makes HttpClient unnecessarily vulnerable, and the
serialized data unnecessarily fat, to serialize this data structure when it
can just be re-instantiated at restore time.

For this reason, I'm going to suggest we implement 2a), to try and allow
the data to automatically "heal" by upgrading to a fixed version.

I have posted the code here:

https://github.com/apache/httpcomponents-client/pull/108

Please review and if this change is satisfactory, please merge to 4.5.x and
all future branches.

Thanks!

-- 
Mark Mielke <ma...@gmail.com>

Re: Regression: BasicCookieStore no longer Serializable after HttpClient 4.5.3

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2018-08-06 at 23:00 -0400, Mark Mielke wrote:
> Hi all:
> 
> I have code that relies on BasicCookieStore being Serializable. After
> updating from HttpClient 4.5.3 to 4.5.6 I found that this code was
> broken.
> 
> The ObjectInputStream.readObject() and
> ObjectOutputStream.writeObject()
> continue to work, but any code that accesses the cookies fails with
> an
> error like:
> 
> Exception in thread "main" java.lang.NullPointerException
> at
> org.apache.http.impl.client.BasicCookieStore.getCookies(BasicCookieSt
> ore.java:116)
> 
> Looking at this change, which was introduced into HttpClient 4.5.4:
> 
> https://github.com/apache/httpcomponents-client/pull/81/commits/c6413
> a01ddcfcde47c0008dca89d076cb2fe5176
> 
> I believe the issue is that a new "lock" variable was introduced, but
> the
> "lock" variable is not Serializable in its current form. When the
> object is
> reconstructed as a result of ObjectInputStream.readObject(), the
> "lock"
> variable is being left as null. Then, code like this:
> 
>     public List<Cookie> getCookies() { lock.readLock().lock(); try {
> //create defensive copy so it won't be concurrently modified return
> new
> ArrayList<>(cookies); } finally { lock.readLock().unlock(); } }
> 
> Is failing, because "lock" is null, and "lock.readLock()" therefore
> returns
> a NullPointerException.
> 
> Another problem here, is that the serialVersionUID has not been
> updated, so
> this allows the new object layout and the old object layout to be
> treated
> as interchangeable by the ObjectInputStream, preventing it from
> detecting
> the conflict in layout.
> 
> I am considering working around this, by serializing the List<Cookie>
> instead of the BasicCookieStore, but I think this could affect others
> and
> it should be fixed.
> 
> I think the options here include:
> 
>    1. Revert this change. It isn't clear to me that this change
> provides
>    significant value. The "synchronized" vs "ReadWriteLock" only
> provides
>    value under significant load. Is the extra complexity really worth
> it?
>    2. Fix the code, by declaring "lock" as transient, and providing
>    readObject and writeObject() methods to deal with this non-
> Serializable
>    field.

I cannot really say how justified it is but other people wanted it and
were willing to provide a patch. I suggest that we keep things as is
but fix the serialization problem.

Cheers

Oleg 


>    3. Fix the code by dynamically allocating a "lock" if it is found
> to be
>    null.
> 
> Personally, I would like to see the case for using a ReadWriteLock
> here,
> and if the case is not good - KISS is a good principle, and a revert
> would
> be the right starting point.
> 
> What do you think?
> 
> Once you provide feedback, I'm willing to implement the change.
> 
> Thanks!
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org