Posted to dev@asterixdb.apache.org by Jianfeng Jia <ji...@gmail.com> on 2015/09/29 03:03:40 UTC
Undefined behavior for substring-before() and substring-after() in match-not-found case
Hi Devs,
Another question about the string functions.
The example code at http://asterixdb.ics.uci.edu/documentation/aql/functions.html#StringFunctions shows that these two functions are supposed to be called after contains(). What is the expected behavior if they can't find the match pattern?
The current result is confusing.
e.g.
let $x := "substring"
return [ substring-before($x, "subx"), substring-after($x, "subx")]
it will return
[ [ "subst", "" ]
]
Should we always return an empty string in such a case, or throw an exception like “you shall filter the result by contains() first”?
IMHO, I’d like to return a null string. Any opinions?
Best,
Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
Re: Undefined behavior for substring-before() and substring-after()
in match-not-found case
Posted by Mike Carey <dt...@gmail.com>.
Agreed!
On 9/28/15 9:59 PM, Chris Hillery wrote:
> Beyond the signature, the documentation of the XQuery function says:
>
> "If the value of $arg1 does not contain a string that is equal to the value
> of $arg2, then the function returns the zero-length string."
>
> So I'd say your inference is correct.
>
> The XQuery doc also explains what happens if $arg1 or $arg2 is empty, and
> we should probably emulate that as well.
>
> Ceej
> aka Chris Hillery
Re: Undefined behavior for substring-before() and substring-after()
in match-not-found case
Posted by Chris Hillery <ch...@hillery.land>.
Beyond the signature, the documentation of the XQuery function says:
"If the value of $arg1 does not contain a string that is equal to the value
of $arg2, then the function returns the zero-length string."
So I'd say your inference is correct.
The XQuery doc also explains what happens if $arg1 or $arg2 is empty, and
we should probably emulate that as well.
Ceej
aka Chris Hillery
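The rules quoted above from the XQuery specification can be sketched in a few lines. This is a minimal Python sketch, not AsterixDB's implementation: the names substring_before and substring_after are hypothetical, None stands in for the XQuery empty sequence, and collations are ignored (plain codepoint matching is assumed). The handling of an empty $arg2 follows the XPath Functions 3.0 text linked elsewhere in this thread.

```python
from typing import Optional


def substring_before(arg1: Optional[str], arg2: Optional[str]) -> str:
    """Sketch of fn:substring-before (hypothetical helper, no collation)."""
    # An empty sequence (None here) is treated as the zero-length string.
    s = arg1 or ""
    pat = arg2 or ""
    # A zero-length pattern yields the zero-length string.
    if pat == "":
        return ""
    idx = s.find(pat)
    # No match: return the zero-length string, never a partial prefix.
    if idx < 0:
        return ""
    return s[:idx]


def substring_after(arg1: Optional[str], arg2: Optional[str]) -> str:
    """Sketch of fn:substring-after (hypothetical helper, no collation)."""
    s = arg1 or ""
    pat = arg2 or ""
    # A zero-length pattern yields the whole input string.
    if pat == "":
        return s
    idx = s.find(pat)
    # No match: return the zero-length string.
    if idx < 0:
        return ""
    return s[idx + len(pat):]
```

Under these assumptions, the example from the original post would give [ "", "" ] rather than [ "subst", "" ].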
Re: Undefined behavior for substring-before() and substring-after() in match-not-found case
Posted by Jianfeng Jia <ji...@gmail.com>.
Thanks for the great summary provided by Taewoo!
The XQuery signature shows that it always returns a string:
fn:substring-before($arg1 as xs:string?, $arg2 as xs:string?) as xs:string
MarkLogic's version (https://docs.marklogic.com/fn.substringBefore) returns an optional string:
fn.substringBefore(
   $input as String?,
   $before as String?,
   [$collation as String]
) as String?
Since all the other string functions either return a string or throw an exception, I think returning an empty string would be the consistent behavior.
Best,
Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
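The two return-type conventions compared in the message above (a plain string versus an optional string) can be contrasted with a small Python sketch. Both function names are hypothetical, and the sketch makes no claim about what MarkLogic actually returns when the pattern is missing; it only shows what each signature allows and what that means for callers.

```python
from typing import Optional


def before_or_empty(s: str, pat: str) -> str:
    """Always returns a string; "" is returned when pat is not found."""
    idx = s.find(pat) if pat else -1
    return s[:idx] if idx >= 0 else ""


def before_or_none(s: str, pat: str) -> Optional[str]:
    """Returns None when pat is not found, so callers must check first."""
    idx = s.find(pat) if pat else -1
    return s[:idx] if idx >= 0 else None


# With the string-only variant, results can be chained without a guard.
shouted = before_or_empty("substring", "subx").upper()

# With the optional variant, the caller has to handle the no-match case.
result = before_or_none("substring", "subx")
label = result.upper() if result is not None else "<no match>"
```

One trade-off of the string-only convention: before_or_empty("substring", "sub") also returns "", so a match at position 0 looks the same as no match. A caller who needs to tell them apart still has to test with contains() first.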
Re: Undefined behavior for substring-before() and substring-after()
in match-not-found case
Posted by Mike Carey <dt...@gmail.com>.
Yes, and the Marklogic entry reminded me - the answer should probably be
modeled (for us) after XQuery - where the answers are fully spelled out
already (having been debated by a group of smart people first and
implemented by a bunch of XQuery engine providers):
http://www.w3.org/TR/xpath-functions-30/#func-substring-before
http://www.w3.org/TR/xpath-functions-30/#func-substring-after
Cheers,
Mike
Re: Undefined behavior for substring-before() and substring-after()
in match-not-found case
Posted by Taewoo Kim <wa...@gmail.com>.
Perhaps we can start from here:
https://docs.google.com/spreadsheets/d/1j6_YSCc_8gEReAWFP84geI30wlnsz7uMFq4TCm7GRz8/edit?usp=sharing
Best,
Taewoo
Re: Undefined behavior for substring-before() and substring-after()
in match-not-found case
Posted by Mike Carey <dt...@gmail.com>.
At times like this it's useful to take a quick look at what other
systems do, if they have such functions - e.g., are there precedents we
should base our answer on? (In Java, Postgres, MySQL, ...)