You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@sling.apache.org by Steven Walters <st...@icidigital.com> on 2015/08/21 03:13:12 UTC

Further clarification of SLING-2512

I was working with the SlingPostServlet operation :import with the XML (not
the JCR XML) format and came across the code that is implicated in
SLING-2512.


This basically renders the entire nt:file import functionality completely
blocked from utilization when going through the SlingPostServlet, as it
always imports through a InputStream which as a result always has the
"xmlLocation" as null.

Yes, not having a base location prevents the use of relative URLs within
the imported XML content, as this would result in MalformedURLException
from not having a protocol.
But as a result, this completely blocks being able to have absolute URLs in
the imported XML content in this scenario.

As this is, local file system can not be imported and separate individual
uploads of the local file system content has to be utilized. This is
decently inconvenient to have to manage separate uploads for all the
individual files in this manner.


Is there any additional detail as to why SLING-2512 is a bug exactly, in
that it is classified as "should not be allowed"?


- Steven

Re: Further clarification of SLING-2512

Posted by "steven.walters" <st...@icidigital.com>.
Bertrand Delacretaz wrote
> Absolute URLs are fine, it's local file references that can be
> problematic, similar to your example.

I'm having trouble understanding what you're trying to state is intended to
be allowed versus not at this point.
I'm interpreting your separate posts over the thread into a contradiction:

A) Stated that local absolute file references such as
/file:/C:/test/sling/content/image1.png/ are not intended to be supported.
B) Now stating that absolute URLs are intended to be supported.

Is leading to a contradiction for me.
This may simply be either terminology discrepancies coming into effect, or
differences in perspective.

the test cases within the existing codebase have only relative file
references in them, there are no test cases that include absolute URLs.
(For such test cases that can run across all systems, this would require
some programmatic behavior to create the XMLs, so only somewhat
understandable that they are missing)

So maybe I'll try this from a different approach in explanation, more
specifically the behavioral analysis of the before and after of SLING-2512
changes
(For simplicity, assuming ideal scenario in which XML is properly formatted
and referenced content files exist, etc.)

example absolute URLs:
* file:/C:/test/sling/content/image1.png
* https://www.google.com/images/srpr/logo11w.png

example relative URLs:
* image1.png
* images/logo.png

*before SLING-2512:*
a) importing with absolute URLs with the parse(URL, ContentCreator)
methodology -> Works
b) importing with relative URLs with the parse(URL, ContentCreator)
methodology -> Works
c) importing with absolute URLs with the parse(InputStream, ContentCreator)
methodology -> Works
d) importing with relative URLs with the parse(InputStream, ContentCreator)
methodology -> MalformedURLException occurs

*after SLING-2512:*
a) importing with absolute URLs with parse(URL, ContentCreator) -> *Works*
b) importing with relative URLs with parse(URL, ContentCreator) -> Works
c) importing with absolute URLs with the parse(InputStream, ContentCreator) 
-> *Blocked*
d) importing with relative URLs with the parse(InputStream, ContentCreator)
-> *Blocked before causing MalformedURLException*

I am fully behind the change in behavior for scenario "/d/".
But I do not understand why scenario "/c/" degraded from being functional to
being blocked altogether, especially since scenario "/a/" still functions.
The current behavior appears overly inconsistent here.

This particular behavioral inconsistency is what I'm seeking clarification
on.

- Steven



--
View this message in context: http://apache-sling.73963.n3.nabble.com/Further-clarification-of-SLING-2512-tp4053816p4053904.html
Sent from the Sling - Users mailing list archive at Nabble.com.

Re: Further clarification of SLING-2512

Posted by Bertrand Delacretaz <bd...@apache.org>.
On Mon, Aug 24, 2015 at 4:08 PM, steven.walters
<st...@icidigital.com> wrote:
> ...If the intent was to not allow absolute URLs in the XML format to exclude
> the possibility of "arbitrary content", then that should've been the
> approach taken instead of the current approach of simply not importing if
> the would-be URL spec (/baseLocation/ here) is null....

Absolute URLs are fine, it's local file references that can be
problematic, similar to your example.

-Bertrand

Re: Further clarification of SLING-2512

Posted by "steven.walters" <st...@icidigital.com>.
Bertrand Delacretaz wrote
> I had another look and the functionality that SLING-2512 disables
> could in certain cases allow arbitrary files to be imported in the
> repository, which we don't want of course.

Actually this isn't a correct assessment from my analysis so far. the
implementation of SLING-2512 doesn't prevent the inclusion of the arbitrary
files,
it simply prevents the inclusion of any and all files when importing the
xmlLocation is unknown, which is when it is imported via an InputStream,
such as through the SlingPostServlet.

If you were to import the supplied XML through the other import mechanism
(through the URL parameter), those files would attempt to imported as they
are.
That is, there is nothing actually preventing the use of absolute URLs in
the XML format, since the only thing done in creation of the URL to import
is the small code snippet of
/new URL(baseLocation, value)/  (from
/org.apache.sling.jcr.contentloader.internal.readers.XmlReader.java:564/).
This native Java functionality is designed to where information is utilized
from the /baseLocation/ if that information is missing from the /value/.
that is, when constructing a new URL in this fashion with /baseLocation/
being "file:/C:/test/sling/sling-import.xml" and the /value/ being
"file:/D:/some/other/seemingly/unrelated/content/video1.mp4"
all information from /baseLocation/ is ignored, because /value/ is a
complete URL already.

If the intent was to not allow absolute URLs in the XML format to exclude
the possibility of "arbitrary content", then that should've been the
approach taken instead of the current approach of simply not importing if
the would-be URL spec (/baseLocation/ here) is null.

Regardless, this functionality is basically nonutilizable for some
intentions I was thinking of it, forcing my hand into a customization, which
is not currently desirable - the intent here was to stay OOTB, so this would
likely end up doing file uploads individually through the SlingPostServlet
instead.



--
View this message in context: http://apache-sling.73963.n3.nabble.com/Further-clarification-of-SLING-2512-tp4053816p4053852.html
Sent from the Sling - Users mailing list archive at Nabble.com.

Re: Further clarification of SLING-2512

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi,

On Fri, Aug 21, 2015 at 3:13 AM, Steven Walters
<st...@icidigital.com> wrote:
> ...Is there any additional detail as to why SLING-2512 is a bug exactly, in
> that it is classified as "should not be allowed"?...

I had another look and the functionality that SLING-2512 disables
could in certain cases allow arbitrary files to be imported in the
repository, which we don't want of course.

So it's correct for this to be disabled now, if you need it in your
environment I see two ways

a) Implement your own variant of that :import operation

b) Use a post-processor that imports the files - you might import only
their names/URLs from the XML and write an observation-triggered
processor that validates their names/locations and if valid imports
them.

-Bertrand

Re: Further clarification of SLING-2512

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi,

On Fri, Aug 21, 2015 at 7:56 PM, steven.walters
<st...@icidigital.com> wrote:
> ...Sure, this is basically around the importing of pre-existing content files,
> so the following is pretty minimal...

Your example didn't make it to the list but it's on nabble.com, here it is:

<?xml version="1.0" encoding="UTF-8"?>
<node xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
    <name>import-content-sling</name>
    <primaryNodeType>nt:unstructured</primaryNodeType>
    <nodes>
        <nt:file src="file:/C:/test/sling/content/image1.png"
mimeType="image/png" lastModified="2015-08-20T19:22:49-0400"/>
        <nt:file src="file:/C:/test/sling/content/video1.mp4"
mimeType="video/mp4"/>
    </nodes>
</node>

>
> I would expect that these files be imported into the JCR without issue or
> complication creating correspondingly named nodes (image1.png and
> video1.mp4),
> but instead the condition at
> *org.apache.sling.jcr.contentloader.internal.readers.XmlReader:246* fails,
> causing the "/file element encountered when xml location isn't known.
> skipping./" warning message to be triggered instead.
>
> The /xmlLocation/ is null because this operation is being triggered via the
> /parse(InputStream ins, ContentCreator creator)/ call, due to being called
> through the SlingPostServlet's ImportOperation  mechanism.
>

I don't remember off the top of my head exactly how this is supposed
to work now, will have a look later this week unless someone beats me
to it.

-Bertrand

Re: Further clarification of SLING-2512

Posted by "steven.walters" <st...@icidigital.com>.
Bertrand Delacretaz wrote
>> ...as a result, this completely blocks being able to have absolute URLs
>> in
>> the imported XML content in this scenario...
> 
> Could you provide a minimal example of what doesn't work for you?
> 
> Maybe an example XML file, indicating what doesn't match your
> expectations after importing.

Sure, this is basically around the importing of pre-existing content files,
so the following is pretty minimal.


I would expect that these files be imported into the JCR without issue or
complication creating correspondingly named nodes (image1.png and
video1.mp4),
but instead the condition at
*org.apache.sling.jcr.contentloader.internal.readers.XmlReader:246* fails,
causing the "/file element encountered when xml location isn't known.
skipping./" warning message to be triggered instead.

The /xmlLocation/ is null because this operation is being triggered via the
/parse(InputStream ins, ContentCreator creator)/ call, due to being called
through the SlingPostServlet's ImportOperation  mechanism.



--
View this message in context: http://apache-sling.73963.n3.nabble.com/Further-clarification-of-SLING-2512-tp4053816p4053833.html
Sent from the Sling - Users mailing list archive at Nabble.com.

Re: Further clarification of SLING-2512

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi,

On Fri, Aug 21, 2015 at 3:13 AM, Steven Walters
<st...@icidigital.com> wrote:
> ...as a result, this completely blocks being able to have absolute URLs in
> the imported XML content in this scenario...

Could you provide a minimal example of what doesn't work for you?

Maybe an example XML file, indicating what doesn't match your
expectations after importing.

-Bertrand