Posted to users@groovy.apache.org by "David M. Karr" <da...@gmail.com> on 2016/02/08 23:17:09 UTC

Escaping unicode reference in slashy string

Someone was trying to point out difficulties with various string values
in slashy strings. I was able to refute most of his arguments, but he
pointed out a curious issue involving Unicode escape sequences.

If you have the following:
--------------
def var = /c:\uabc.txt/
---------------

This fails to compile, because "\uabc." is not a valid Unicode escape:
"\u" must be followed by four hexadecimal digits, and the fourth
character here is ".".

So, the obvious thing to try is this:
------------------
def var = /c:\\uabc.txt/
------------------

That would fix it, right?  Well, sort of.  It doesn't get a compile 
error.  I expected it to produce "c:\uabc.txt", but instead it produced 
"c:\\uabc.txt".

What are relatively simple workarounds for this?

Re: Escaping unicode reference in slashy string

Posted by Paul King <pa...@asert.com.au>.
That's correct, David. The dollar slashy string handles a bunch of edge
cases that arise with slashy strings in a more sane way, but it is still
parsed after Unicode escapes have been processed.

Cheers, Paul.
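
A minimal sketch of that point, assuming the behavior described in this thread (Groovy releases current in 2016; other versions may differ). The variable names are only illustrative. The backslash is doubled in both literals so the source never contains a bare Unicode escape.

```groovy
// Slashy string: a backslash only escapes a forward slash, so a doubled
// backslash stays doubled.
def slashy = /c:\\uabc.txt/

// Dollar slashy string: the escape character is '$', so backslashes are
// plain characters here as well.
def dollarSlashy = $/c:\\uabc.txt/$

// Both forms keep the two backslashes exactly as written.
assert slashy.count('\\') == 2
assert dollarSlashy.count('\\') == 2
assert dollarSlashy == slashy
```

So switching from a slashy to a dollar slashy string does not change the result for this particular case.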

On Wed, Feb 10, 2016 at 2:28 AM, David M. Karr
<da...@gmail.com> wrote:
> On 02/08/2016 06:55 PM, Edinson E. Padrón Urdaneta wrote:
>> Hi, David. Maybe this can be of help ~> {
>> http://www.groovy-lang.org/single-page-documentation.html } The Dollar
>> slashy string could be what you are looking for. Cheers.
>
> Just so it's clear, this doesn't deal with the "\uabc" case. With one
> backslash, it fails with a compile error. With two backslashes, it
> produces two backslashes. From what I now understand, what works if you
> have "\u" in the string is a regular double-quoted string with the
> backslash doubled.

Re: Escaping unicode reference in slashy string

Posted by "David M. Karr" <da...@gmail.com>.
On 02/08/2016 06:55 PM, Edinson E. Padrón Urdaneta wrote:
> Hi, David. Maybe this can be of help ~> { 
> http://www.groovy-lang.org/single-page-documentation.html } The Dollar 
> slashy string could be what you are looking for. Cheers.

Just so it's clear, this doesn't deal with the "\uabc" case. With one
backslash, it fails with a compile error. With two backslashes, it
produces two backslashes. From what I now understand, what works if you
have "\u" in the string is a regular double-quoted string with the
backslash doubled.
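
As a rough illustration of the quoted-string workaround (a sketch, not a definitive statement of the language rules; the variable names are made up for this example), both ordinary quoted forms accept a doubled backslash:

```groovy
// Each literal doubles the backslash, so no bare Unicode escape appears
// anywhere in this source.
def doubleQuoted = "c:\\uabc.txt"
def singleQuoted = 'c:\\uabc.txt'

// Both forms treat the doubled backslash as an escape for one backslash.
assert doubleQuoted == singleQuoted
assert doubleQuoted.length() == 11

// Without interpolation, the double-quoted literal is a plain String.
assert doubleQuoted instanceof String

println doubleQuoted
```

The single-quoted form is included only for comparison; the thread itself discusses the double-quoted one.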

Re: Escaping unicode reference in slashy string

Posted by "Edinson E. Padrón Urdaneta" <ed...@gmail.com>.
Hi, David. Maybe this can be of help ~> {
http://www.groovy-lang.org/single-page-documentation.html } The Dollar
slashy string could be what you are looking for. Cheers.

Re: Escaping unicode reference in slashy string

Posted by "David M. Karr" <da...@gmail.com>.
On 02/08/2016 05:40 PM, Paul King wrote:
> Unicode escape processing is done before anything else. For your case, you
> need to make it not look like a Unicode escape sequence - which you rightly
> did with the double backslash variant. But you then need to pick the
> string form that does the appropriate thing with the sequence of
> characters that passed through the initial parsing stages. So the
> trick is to use a normal double-quoted string, not a slashy string:
>
> def var = "c:\\uabc.txt"

I can see it can get complicated to select the correct form, depending 
on problematic characters in the string.  Is there anything like the 
Perl "qw()" function, which I believe augments a string with any 
required quoting to retain all the original characters?

Re: Escaping unicode reference in slashy string

Posted by Paul King <pa...@asert.com.au>.
Unicode escape processing is done before anything else. For your case, you
need to make it not look like a Unicode escape sequence - which you rightly
did with the double backslash variant. But you then need to pick the
string form that does the appropriate thing with the sequence of
characters that passed through the initial parsing stages. So the
trick is to use a normal double-quoted string, not a slashy string:

def var = "c:\\uabc.txt"

Cheers, Paul.
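
To make the difference concrete, here is a small sketch comparing the two forms side by side. It assumes the behavior described in this thread, and the variable names are only illustrative. Note that even the comments avoid a single backslash followed by "u", since escape processing is described above as happening before anything else.

```groovy
// Double-quoted string: the doubled backslash is an escape for a single
// backslash, so the value has one backslash after the colon.
def quoted = "c:\\uabc.txt"

// Slashy string: the doubled backslash is not an escape, so the value has
// two backslashes after the colon.
def slashy = /c:\\uabc.txt/

println quoted
println slashy

assert quoted.count('\\') == 1
assert slashy.count('\\') == 2
```

The doubled backslash gets the text past escape processing in both cases; only the quoted form then collapses it back to one backslash.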
