Posted to user@commons.apache.org by Jacek Furmankiewicz <ja...@gmail.com> on 2009/04/15 14:39:57 UTC
StrTokenizer not handling quotes correctly?
I am trying to use StrTokenizer for some parsing and I am probably not using
it correctly.
Let's say I have this string:
11"a,b"11,22"c,d"22
I would like to split it on the comma ",", ignoring any commas embedded
in quotes. I tried this:
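// StrTokenizer is org.apache.commons.lang.text.StrTokenizer (Commons Lang 2.x)
String test = "11\"a,b\"11,22\"c,d\"22";
// delimiter is ',' and quote character is '"'
StrTokenizer str = new StrTokenizer(test, ',', '"');
String[] tokens = str.getTokenArray();
for (String t : tokens) {
    System.out.println(t);
}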
and expect to have two strings print out:
11"a,b"11
22"c,d"22
but instead I get 4:
11"a
b"11
22"c
d"22
It seems the tokenizer is splitting on the comma, even if it is embedded in
quotes.
I tried different options on the StrTokenizer, but have not been able to get
it to work correctly.
Any idea what I am doing wrong? I am using the latest version, 2.4.
Thanks, Jacek
Re: StrTokenizer not handling quotes correctly?
Posted by sebb <se...@gmail.com>.
On 15/04/2009, Jacek Furmankiewicz <ja...@gmail.com> wrote:
> I am trying to use StrTokenizer for some parsing and I am probably not using
> it correctly.
>
> Let's say I have this string:
>
> 11"a,b"11,22"c,d"22
>
> I would like to split it by the comma ",", but ignoring any commas embedded
> in quotes. I try this:
>
> String test = "11\"a,b\"11,22\"c,d\"22";
> StrTokenizer str = new StrTokenizer(test,',','"');
> String[] tokens = str.getTokenArray();
>
> for(String t: tokens) {
> System.out.println(t);
> }
>
> and expect to have two strings print out:
>
> 11"a,b"11
> 22"c,d"22
>
> but instead I get 4 :
>
> 11"a
> b"11
> 22"c
> d"22
>
> It seems the tokenizer is splitting on the comma, even if it is embedded in
> quotes.
Quote characters are only allowed inside quoted tokens, i.e. tokens that
start and end with the quote character. From the Javadoc:
"Each token may be surrounded by quotes. The quote matcher specifies
the quote character(s). A quote may be escaped within a quoted section
by duplicating itself."
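To illustrate the rule in that Javadoc excerpt, here is a minimal sketch in plain Java (no Commons Lang dependency) of how input could be prepared: each field is wrapped in quotes and every embedded quote is doubled. The class name QuoteFields and the methods quoteField and joinQuoted are hypothetical names made up for this example; they are not part of StrTokenizer.

```java
public class QuoteFields {

    // Wrap a raw field in quotes, doubling every embedded quote character.
    public static String quoteField(String raw, char quote) {
        String q = String.valueOf(quote);
        return q + raw.replace(q, q + q) + q;
    }

    // Quote each field and join the results with the delimiter.
    public static String joinQuoted(String[] fields, char delim, char quote) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(delim);
            }
            sb.append(quoteField(fields[i], quote));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two tokens the original poster wants to get back.
        String[] fields = { "11\"a,b\"11", "22\"c,d\"22" };
        System.out.println(joinQuoted(fields, ',', '"'));
    }
}
```

For the two desired tokens this prints "11""a,b""11","22""c,d""22", which is the form of input the quoted Javadoc describes: the outer quotes delimit each token and each doubled quote stands for one literal quote.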
> I tried different options on the StrTokenizer, but not been able to get it
> to work correctly.
>
> Any idea as to what am I doing wrong? Using latest version 2.4.
The input needs to look like this:
"11""a,b""11","22""c,d""22"
> Thanks, Jacek
>
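If the input format cannot be changed, one alternative is a small hand-written splitter. The following is only a sketch under stated assumptions: quotes always come in pairs, they should be kept in the output, and a delimiter counts only when it is outside a quoted run. QuoteAwareSplit and split are hypothetical names for this example; this is not how StrTokenizer works internally.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteAwareSplit {

    // Split on delim, except where it appears between a pair of quote
    // characters. Quote characters are kept in the returned tokens.
    public static List<String> split(String input, char delim, char quote) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;      // toggle on every quote character
                current.append(c);
            } else if (c == delim && !inQuotes) {
                tokens.add(current.toString());
                current.setLength(0);      // start the next token
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString());    // last token (may be empty)
        return tokens;
    }

    public static void main(String[] args) {
        for (String t : split("11\"a,b\"11,22\"c,d\"22", ',', '"')) {
            System.out.println(t);
        }
    }
}
```

With the test string from the original question this prints the two lines 11"a,b"11 and 22"c,d"22. It does not handle doubled quotes as an escape or trim whitespace, so it is a starting point rather than a replacement for a real tokenizer.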