You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@shindig.apache.org by Ziv Horesh <zh...@gmail.com> on 2010/06/14 19:12:28 UTC

Re: Removing conversion of & to & and vice versa when generating url (issue1632042)

If I understand correctly, the html definition call for escaping & with
&amp;, if only for the small chance of something actually want a value like
'&lt;' as is.
So I don't think we want the new logic in your change to disable it.
What we need is to fix the places it is not converted back correctly.


On Sat, Jun 12, 2010 at 10:20 PM, <ga...@gmail.com> wrote:

> Reviewers: johnfargo, shindig.remailer_gmail.com, zhoresh,
>
> Message:
>
> On 2010/06/09 23:04:12, zhoresh wrote:
>
>> Good catch in the MutableContent!
>>
>
> Thanks
>
>
>  http://codereview.appspot.com/1632042/diff/2001/3003
>> File
>>
>
>
> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java
>
>> (left):
>>
>
>  http://codereview.appspot.com/1632042/diff/2001/3003#oldcode169
>>
>
>
> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java:169:
>
>> elem.getNamespaceURI() == null &&
>> Can you create a test case that show it is not needed? (or maybe I
>>
> didn't see
>
>> it)
>>
>
>  http://codereview.appspot.com/1632042/diff/2001/3005
>> File
>>
>
>
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java
>
>> (right):
>>
>
>  http://codereview.appspot.com/1632042/diff/2001/3005#newcode135
>>
>
>
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java:135:
>
>> concatUri.getUri().toString());
>> I guess there is a reason why it was added, maybe the right solution
>>
> is to fix
>
>> the parser?
>>
>
> I thought i understood what was going on, but now i don't.
>
> What I think is going on is:
> Elements created using Document.createElement() method have their
> getNamespaceURI as null.
> Elements that are parsed from html or copied / cloned around here and
> there have their getNamespaceURI as non-null.
>
> In both ConcatVistor and RenderingGadgetRewriter, we create elements
> using createElement and these will have getNamespaceURI() as null. We
> manually go and replace all occurances of '&' with '&amp;' in the url.
>
> In DefaultHtmlSerializer, we check for url's and the logic states what
> for everything other than a url, replace '&' with '&amp;'. But the logic
> for isUrl has the check element.getNamespaceURI() which will count only
> the elements created using createElement.
> So, all other urls that were parsed from html, get their '&' replaced by
> '&amp;'
>
> Thus, the actual behavior seems to be to replace '&' by '&amp;' but the
> intent of the code logic states otherwise.
>
> My main question is: why do we need this escaping & to &amp business. We
> should really not be doing such hardcoding stuff either.
>
> To make things even worse, in StyleTagExtractorVisitor, we create a
> <link> node but we dont seem to be replacing '&' with '&amp;' here. So
> we don't seem to be consistent either.
>
> Here is a list of all places where we use '&amp;':
>
> http://www.google.com/codesearch?hl=en&sa=N&filter=0&num=100&q=file:gadgets+file:shindig+file
> :\.java+\%26amp;+-file:Test.java
>
>
> To make things even worser, browsers seem to be treating &amp; in urls
> differently.
> Using the changes in current patch, i loaded
> view-source:
> http://localhost:9003/gadgets/accel?container=default&url=http://www.cnn.com/
>
> It has a concatenated script tag:
> <script
> src="
> http://localhost:9003/gadgets/concat?container=default&amp;gadget=http%3A%2F%2Fwww.cnn.com%2F&amp;debug=0&amp;nocache=0&amp;type=js&amp;json=false&amp;1=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&amp;2=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&amp;3=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&amp;4=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&amp;5=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&amp;6=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&amp;7=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&amp;8=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&amp;9=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&amp;10=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&amp;11=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&amp;12=http%3A%2F%2Fi.cdn.turner.com%2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&amp;13=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&amp;14=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&amp;15=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&amp;16=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&amp;17=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&amp;18=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&amp;19=http%3A%2F%2Fwww.cnn.com%2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&amp;20=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&amp;21=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js<http://localhost:9003/gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com%2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F%2Fi.cdn.turner.com%2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%2Fwww.cnn.com%2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js>
> "
> type="text/javascript"></script>
>
> Then i ran nc -l -p 9003
>
> If i click on the img src link from within firefox, the request i get is
> this:
> GET
> /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com
> %2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F%
> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F%
> 2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%2Fwww.cnn.com
> %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F%
> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js
> HTTP/1.1
>
> that is, with &amp; in url replaced by &
>
>
> Now if i directly copy and paste the exact link in the navigation bar,
> here is what i get:
>
> GET
> /gadgets/concat?container=default&amp;gadget=http%3A%2F%2Fwww.cnn.com
> %2F&amp;debug=0&amp;nocache=0&amp;type=js&amp;json=false&amp;1=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&amp;2=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&amp;3=http%3A%2F%
> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&amp;4=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&amp;5=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&amp;6=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&amp;7=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&amp;8=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&amp;9=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&amp;10=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&amp;11=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&amp;12=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&amp;13=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&amp;14=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&amp;15=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&amp;16=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&amp;17=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&amp;18=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&amp;19=http%3A%2F%
> 2Fwww.cnn.com
> %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&amp;20=http%3A%2F%
> 2Fi.cdn.turner.com
> %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&amp;21=http%3A%2F%
> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js
> HTTP/1.1
>
> that is, with &amp; as is.
> This url throws a 404 with the following error:
> MISSING_PARAMETER ..<url>.. Missing type.
>
>
> I have spent quite some time trying to think this through and come up
> with an acceptable solution. In the process, i ended up adding an option
> called gadgets.uri.escapeAmpersandToAndAmp in the shindig.properties
> file, hoping that i could encapsulate all the '&amp;' logic with this
> flag and this would be a non breaking change. Alas, im not fluent enough
> in Guice to accomplish that [I could not inject Injector into some
> classes and ended up modifying lot of code :( ]
>
>
> My suggestion is to get rid of escaping '&amp;' business for urls
> everywhere. Why do we need it anyways ?
> To make things worsest, codereview did not let me submit this reply and
> kept throwing invalid XSRF token or some crap.
> So end result = i'm confused and kind of giving up.
>
> Description:
> Currently we have code that converts & to &amp; and then &amp; in url
> back to &.
> There is a bug right now in the way that dynamic nodes are added (say
> from StyleConcatVisitor) for which the reconversion from &amp; to & does
> not happen.
>
>
> Please review this at http://codereview.appspot.com/1632042/show
>
> Affected files:
>  M     .gitignore
>  M     config/container.js
>  M     java/common/conf/shindig.properties
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializer.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriter.java
>  A
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/AbsolutePathReferenceRewriter.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ProxyingContentRewriter.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/RewriteModule.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ScriptConcatContentRewriter.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/StyleConcatContentRewriter.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/HtmlAccelServlet.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/spec/MessageBundle.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManager.java
>  M
> java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManager.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/http/AbstractHttpFetcherTest.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java
>  A
> java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializerTest.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriterTest.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/SanitizingGadgetRewriterTest.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/old/SanitizingGadgetRewriterTest.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/ConcatVisitorTest.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/DomWalkerTestBase.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/RewriterTestBase.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/old/BaseRewriterTestCase.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManagerTest.java
>  M
> java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManagerTest.java
>
>
>

Re: Removing conversion of & to & and vice versa when generating url (issue1632042)

Posted by John Hjelmstad <jo...@gmail.com>.
+1 to that. The reason you can't copy-and-paste the URL into your browser
bar is that the URL is unescaped by the browser when it's the value of a
target attribute ie src="value".

Re: the XSRF issue on codereview, that happens occasionally with the tool
and is orthogonal. I've seen it occur when you start an operation but don't
complete it within a period of time (in the tool). A refresh usually fixes
the problem.

--j

On Mon, Jun 14, 2010 at 10:12 AM, Ziv Horesh <zh...@gmail.com> wrote:

>
> If I understand correctly, the html definition call for escaping & with
> &amp;, if only for the small chance of something actually want a value like
> '&lt;' as is.
> So I don't think we want the new logic in your change to disable it.
> What we need is to fix the places it is not converted back correctly.
>
>
>
> On Sat, Jun 12, 2010 at 10:20 PM, <ga...@gmail.com> wrote:
>
>> Reviewers: johnfargo, shindig.remailer_gmail.com, zhoresh,
>>
>> Message:
>>
>> On 2010/06/09 23:04:12, zhoresh wrote:
>>
>>> Good catch in the MutableContent!
>>>
>>
>> Thanks
>>
>>
>>  http://codereview.appspot.com/1632042/diff/2001/3003
>>> File
>>>
>>
>>
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java
>>
>>> (left):
>>>
>>
>>  http://codereview.appspot.com/1632042/diff/2001/3003#oldcode169
>>>
>>
>>
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java:169:
>>
>>> elem.getNamespaceURI() == null &&
>>> Can you create a test case that show it is not needed? (or maybe I
>>>
>> didn't see
>>
>>> it)
>>>
>>
>>  http://codereview.appspot.com/1632042/diff/2001/3005
>>> File
>>>
>>
>>
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java
>>
>>> (right):
>>>
>>
>>  http://codereview.appspot.com/1632042/diff/2001/3005#newcode135
>>>
>>
>>
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java:135:
>>
>>> concatUri.getUri().toString());
>>> I guess there is a reason why it was added, maybe the right solution
>>>
>> is to fix
>>
>>> the parser?
>>>
>>
>> I thought i understood what was going on, but now i don't.
>>
>> What I think is going on is:
>> Elements created using Document.createElement() method have their
>> getNamespaceURI as null.
>> Elements that are parsed from html or copied / cloned around here and
>> there have their getNamespaceURI as non-null.
>>
>> In both ConcatVistor and RenderingGadgetRewriter, we create elements
>> using createElement and these will have getNamespaceURI() as null. We
>> manually go and replace all occurances of '&' with '&amp;' in the url.
>>
>> In DefaultHtmlSerializer, we check for url's and the logic states what
>> for everything other than a url, replace '&' with '&amp;'. But the logic
>> for isUrl has the check element.getNamespaceURI() which will count only
>> the elements created using createElement.
>> So, all other urls that were parsed from html, get their '&' replaced by
>> '&amp;'
>>
>> Thus, the actual behavior seems to be to replace '&' by '&amp;' but the
>> intent of the code logic states otherwise.
>>
>> My main question is: why do we need this escaping & to &amp business. We
>> should really not be doing such hardcoding stuff either.
>>
>> To make things even worse, in StyleTagExtractorVisitor, we create a
>> <link> node but we dont seem to be replacing '&' with '&amp;' here. So
>> we don't seem to be consistent either.
>>
>> Here is a list of all places where we use '&amp;':
>>
>> http://www.google.com/codesearch?hl=en&sa=N&filter=0&num=100&q=file:gadgets+file:shindig+file
>> :\.java+\%26amp;+-file:Test.java
>>
>>
>> To make things even worser, browsers seem to be treating &amp; in urls
>> differently.
>> Using the changes in current patch, i loaded
>> view-source:
>> http://localhost:9003/gadgets/accel?container=default&url=http://www.cnn.com/
>>
>> It has a concatenated script tag:
>> <script
>> src="
>> http://localhost:9003/gadgets/concat?container=default&amp;gadget=http%3A%2F%2Fwww.cnn.com%2F&amp;debug=0&amp;nocache=0&amp;type=js&amp;json=false&amp;1=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&amp;2=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&amp;3=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&amp;4=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&amp;5=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&amp;6=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&amp;7=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&amp;8=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&amp;9=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&amp;10=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&amp;11=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&amp;12=http%3A%2F%2Fi.cdn.turner.com%2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&amp;13=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&amp;14=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&amp;15=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&amp;16=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&amp;17=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&amp;18=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&amp;19=http%3A%2F%2Fwww.cnn.com%2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&amp;20=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&amp;21=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js<http://localhost:9003/gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com%2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F%2Fi.cdn.turner.com%2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%2Fwww.cnn.com%2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F%2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js>
>> "
>> type="text/javascript"></script>
>>
>> Then i ran nc -l -p 9003
>>
>> If i click on the img src link from within firefox, the request i get is
>> this:
>> GET
>> /gadgets/concat?container=default&gadget=http%3A%2F%2Fwww.cnn.com
>> %2F&debug=0&nocache=0&type=js&json=false&1=http%3A%2F%2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&2=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&3=http%3A%2F%
>> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&4=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&5=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&6=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&7=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&8=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&9=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&10=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&11=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&12=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&13=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&14=http%3A%2F%
>> 2Fi.cdn.turner.com%2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&15=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&16=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&17=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&18=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&19=http%3A%2F%
>> 2Fwww.cnn.com
>> %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&20=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&21=http%3A%2F%
>> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js
>> HTTP/1.1
>>
>> that is, with &amp; in url replaced by &
>>
>>
>> Now if i directly copy and paste the exact link in the navigation bar,
>> here is what i get:
>>
>> GET
>> /gadgets/concat?container=default&amp;gadget=http%3A%2F%2Fwww.cnn.com
>> %2F&amp;debug=0&amp;nocache=0&amp;type=js&amp;json=false&amp;1=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fomni.time.js&amp;2=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fprotoaculous.1.8.2.min.js&amp;3=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fmain.js&amp;4=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fswfobject-2.2.js&amp;5=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2FcsiManager.js&amp;6=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2FStorageManager.js&amp;7=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fconnect%2Fconnect-lite.js&amp;8=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Flocal.js%3F20100421&amp;9=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F2.0%2FcnnSkypeOpen.js&amp;10=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp_suppl.js&amp;11=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fvideo%2Fcvp.js&amp;12=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fxslo%2Fcvp%2Fads%2Ffreewheel%2Fjs%2Ffwjslib_1.1.js%3Fversion%3D1.1&amp;13=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F2.0%2Fad_head0.js&amp;14=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2Fcnn_adspaces%2Fcnn_adspaces.js&amp;15=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Foo_engine.js&amp;16=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fspecials%2Fworldcup%2Fwcdata.day.js&amp;17=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fweather.footer.js&amp;18=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fedition.vars.js&amp;19=http%3A%2F%
>> 2Fwww.cnn.com
>> %2F.element%2Fjs%2F3.0%2Fintl%2Fedition_footer.js&amp;20=http%3A%2F%
>> 2Fi.cdn.turner.com
>> %2Fcnn%2F.element%2Fjs%2F3.0%2Fs_code.intl.js&amp;21=http%3A%2F%
>> 2Fi.cdn.turner.com%2Fcnn%2F.element%2Fjs%2F3.0%2Fhpsectiontracking.js
>> HTTP/1.1
>>
>> that is, with &amp; as is.
>> This url throws a 404 with the following error:
>> MISSING_PARAMETER ..<url>.. Missing type.
>>
>>
>> I have spent quite some time trying to think this through and come up
>> with an acceptable solution. In the process, i ended up adding an option
>> called gadgets.uri.escapeAmpersandToAndAmp in the shindig.properties
>> file, hoping that i could encapsulate all the '&amp;' logic with this
>> flag and this would be a non breaking change. Alas, im not fluent enough
>> in Guice to accomplish that [I could not inject Injector into some
>> classes and ended up modifying lot of code :( ]
>>
>>
>> My suggestion is to get rid of escaping '&amp;' business for urls
>> everywhere. Why do we need it anyways ?
>> To make things worsest, codereview did not let me submit this reply and
>> kept throwing invalid XSRF token or some crap.
>> So end result = i'm confused and kind of giving up.
>>
>> Description:
>> Currently we have code that converts & to &amp; and then &amp; in url
>> back to &.
>> There is a bug right now in the way that dynamic nodes are added (say
>> from StyleConcatVisitor) for which the reconversion from &amp; to & does
>> not happen.
>>
>>
>> Please review this at http://codereview.appspot.com/1632042/show
>>
>> Affected files:
>>  M     .gitignore
>>  M     config/container.js
>>  M     java/common/conf/shindig.properties
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializer.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializer.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriter.java
>>  A
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/AbsolutePathReferenceRewriter.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ConcatVisitor.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ProxyingContentRewriter.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/RewriteModule.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/ScriptConcatContentRewriter.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/rewrite/StyleConcatContentRewriter.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/HtmlAccelServlet.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/spec/MessageBundle.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManager.java
>>  M
>> java/gadgets/src/main/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManager.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/http/AbstractHttpFetcherTest.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java
>>  A
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/DefaultHtmlSerializerTest.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/RenderingGadgetRewriterTest.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/SanitizingGadgetRewriterTest.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/render/old/SanitizingGadgetRewriterTest.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/ConcatVisitorTest.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/DomWalkerTestBase.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/RewriterTestBase.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/rewrite/old/BaseRewriterTestCase.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultConcatUriManagerTest.java
>>  M
>> java/gadgets/src/test/java/org/apache/shindig/gadgets/uri/DefaultProxyUriManagerTest.java
>>
>>
>>
>