Posted to user@hbase.apache.org by Henry Hung <YT...@winbond.com> on 2014/06/16 05:20:44 UTC

RegexStringComparator problem: Why pattern "u" has the same result as ".*u.*" ?

I have this data set and the value I want to test is "cf:c" = "hung":

hbase(main):001:0> scan 'TEST'
ROW                                                          COLUMN+CELL
\x00\x00\x00\x03abc\x00\x00\x00\x02                         column=cf:a, timestamp=1402649511909, value=abc
\x00\x00\x00\x03abc\x00\x00\x00\x02                         column=cf:b, timestamp=1402649511909, value=\x00\x00\x00\x02
\x00\x00\x00\x03abc\x00\x00\x00\x02                         column=cf:c, timestamp=1402649511909, value=def
\x00\x00\x00\x03abc\x00\x00\x00\x02                         column=cf:d, timestamp=1402649511909, value=\x00\x00\x01F\x93\x81s\xA8
\x00\x00\x00\x03abc\x00\x00\x00\x03                         column=cf:a, timestamp=1402649610557, value=abc
\x00\x00\x00\x03abc\x00\x00\x00\x03                         column=cf:b, timestamp=1402649610557, value=\x00\x00\x00\x03
\x00\x00\x00\x03abc\x00\x00\x00\x03                         column=cf:c, timestamp=1402649610557, value=def
\x00\x00\x00\x03abc\x00\x00\x00\x03                         column=cf:d, timestamp=1402649610557, value=\x00\x00\x01F\x93\x81s\xA8
\x00\x00\x00\x03abc\x00\x00\x00\x04                         column=cf:a, timestamp=1402650015602, value=abc
\x00\x00\x00\x03abc\x00\x00\x00\x04                         column=cf:b, timestamp=1402650015602, value=\x00\x00\x00\x04
\x00\x00\x00\x03abc\x00\x00\x00\x04                         column=cf:c, timestamp=1402650015602, value=def
\x00\x00\x00\x03abc\x00\x00\x00\x04                         column=cf:d, timestamp=1402650015602, value=\x00\x00\x01F\x93\x81s\xA8
\x00\x00\x00\x05henry\x00\x00\x00\x06                       column=cf:a, timestamp=1402886404698, value=henry
\x00\x00\x00\x05henry\x00\x00\x00\x06                       column=cf:b, timestamp=1402886404698, value=\x00\x00\x00\x06
\x00\x00\x00\x05henry\x00\x00\x00\x06                       column=cf:c, timestamp=1402886404698, value=hung
\x00\x00\x00\x05henry\x00\x00\x00\x06                       column=cf:d, timestamp=1402886404698, value=\x00\x00\x01F\xA2\x8A\xBD\xA0
\x00\x00\x00\x06abcdef\x00\x00\x00\x01                      column=cf:a, timestamp=1402650022755, value=abcdef
\x00\x00\x00\x06abcdef\x00\x00\x00\x01                      column=cf:b, timestamp=1402650022755, value=\x00\x00\x00\x01
\x00\x00\x00\x06abcdef\x00\x00\x00\x01                      column=cf:c, timestamp=1402650022755, value=def
\x00\x00\x00\x06abcdef\x00\x00\x00\x01                      column=cf:d, timestamp=1402650022755, value=\x00\x00\x01F\x93\x81s\xA8
\x00\x00\x00\x06abcdef\x00\x00\x00\x02                      column=cf:a, timestamp=1402650025763, value=abcdef
\x00\x00\x00\x06abcdef\x00\x00\x00\x02                      column=cf:b, timestamp=1402650025763, value=\x00\x00\x00\x02
\x00\x00\x00\x06abcdef\x00\x00\x00\x02                      column=cf:c, timestamp=1402650025763, value=def
\x00\x00\x00\x06abcdef\x00\x00\x00\x02                      column=cf:d, timestamp=1402650025763, value=\x00\x00\x01F\x93\x81s\xA8
6 row(s) in 0.1090 seconds


I wrote a small program to test it:

HTable conn = new HTable(HBaseConfiguration.create(), "TEST");
try {
    Scan scan = new Scan();
    // Keep only rows whose cf:c value satisfies the regex "u"
    RegexStringComparator comp = new RegexStringComparator("u");
    SingleColumnValueFilter filter = new SingleColumnValueFilter(
            Bytes.toBytes("cf"), Bytes.toBytes("c"), CompareOp.EQUAL, comp);
    FilterList filters = new FilterList(Operator.MUST_PASS_ALL);
    filters.addFilter(filter);
    scan.setFilter(filters);
    ResultScanner rs = conn.getScanner(scan);
    try {
        // Print cf:c of the first row that passes the filter
        Result r = rs.next();
        System.out.println(Bytes.toString(r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("c"))));
    } finally {
        rs.close();
    }
} finally {
    conn.close();
}

Because I use the regex "u" as the value comparator, I expected no row to match, so rs.next() would return null and the program would throw a NullPointerException.
But when I execute it, the result is "hung".

My question is: why does the SingleColumnValueFilter not abide by the regex comparator? Or why does the regex comparator treat "u" the same as ".*u.*"?

Best regards,
Henry Hung

________________________________
The privileged confidential information contained in this email is intended for use only by the addressees as indicated by the original sender of this email. If you are not the addressee indicated in this email or are not responsible for delivery of the email to such a person, please kindly reply to the sender indicating this fact and delete all copies of it from your computer and network server immediately. Your cooperation is highly appreciated. It is advised that any unauthorized use of confidential information of Winbond is strictly prohibited; and any information in this email irrelevant to the official business of Winbond shall be deemed as neither given nor endorsed by Winbond.

Re: RegexStringComparator problem: Why pattern "u" has the same result as ".*u.*" ?

Posted by Ted Yu <yu...@gmail.com>.
You can pass a subclass of RegexStringComparator to SingleColumnValueFilter.
This subclass can use matches() to suit your needs.

RegexStringComparator has been in the 0.94 release for so long that its
implementation cannot be changed.

Cheers
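
The compareTo() quoted elsewhere in this thread returns 0 when find() succeeds and 1 otherwise. A minimal sketch of the same contract built on matches() could look like the following. It is a stand-alone illustration: the class name FullMatchComparison is hypothetical, it does not extend RegexStringComparator, and the UTF-8 charset is an assumption. Wiring such logic into HBase (subclassing, serialization, making the class available to the region servers) depends on the HBase version and is not shown.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical stand-alone class; it only mirrors the comparator's
// return convention and is NOT an HBase comparator.
class FullMatchComparison {
    private final Pattern pattern;
    // Assumption: cell values are UTF-8 encoded strings.
    private final Charset charset = StandardCharsets.UTF_8;

    FullMatchComparison(String regex) {
        // DOTALL matches the flag used in the test program in this thread.
        this.pattern = Pattern.compile(regex, Pattern.DOTALL);
    }

    // Returns 0 when the whole decoded value matches the pattern, 1 otherwise.
    int compareTo(byte[] value, int offset, int length) {
        String tmp = new String(value, offset, length, charset);
        return pattern.matcher(tmp).matches() ? 0 : 1;
    }

    public static void main(String[] args) {
        byte[] cell = "hung".getBytes(StandardCharsets.UTF_8);
        System.out.println(new FullMatchComparison("u").compareTo(cell, 0, cell.length));     // 1
        System.out.println(new FullMatchComparison(".*u.*").compareTo(cell, 0, cell.length)); // 0
    }
}
```

With matches(), the pattern "u" no longer selects the value "hung", while ".*u.*" still does.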



RE: RegexStringComparator problem: Why pattern "u" has the same result as ".*u.*" ?

Posted by Henry Hung <YT...@winbond.com>.
I found the problem:

Inside RegexStringComparator, compareTo() uses find() rather than matches():

  public int compareTo(byte[] value, int offset, int length) {
    // Use find() for subsequence match instead of matches() (full sequence
    // match) to adhere to the principle of least surprise.
    String tmp;
    if (length < value.length / 2) {
      // See HBASE-9428. Make a copy of the relevant part of the byte[],
      // or the JDK will copy the entire byte[] during String decode
      tmp = new String(Arrays.copyOfRange(value, offset, offset + length), charset);
    } else {
      tmp = new String(value, offset, length, charset);
    }
    return pattern.matcher(tmp).find() ? 0 : 1;
  }


I used a simple program to test the difference between matches() and find():

String s = "hung";
Pattern p = Pattern.compile("u", Pattern.DOTALL);
Matcher m = p.matcher(s);
System.out.println(m.matches()); // prints false
System.out.println(m.find());    // prints true

p = Pattern.compile(".*u.*", Pattern.DOTALL);
m = p.matcher(s);
System.out.println(m.matches()); // prints true
System.out.println(m.find());    // prints false, but only because the matcher was not reset after
                                 // matches() consumed the whole input; on a fresh matcher find() prints true

The method matches() is what I need right now, and to me it is the more reasonable choice, but I don't know how to switch to it without modifying the source code.
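
To separate the effect of find() versus matches() from the effect of reusing a Matcher, the comparison can be repeated with a fresh matcher per call. This is a minimal sketch using only java.util.regex; the class name FindVsMatches and its helper methods are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of HBase.
class FindVsMatches {

    // Full-sequence match: the whole input must match the pattern.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).matches();
    }

    // Subsequence match on a fresh matcher: any part of the input may match.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).find();
    }

    // Calls matches() and then find() on the SAME matcher without reset(),
    // as in the test program above.
    static boolean findAfterMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        m.matches();
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("u", "hung"));            // false
        System.out.println(partialMatch("u", "hung"));         // true
        System.out.println(fullMatch(".*u.*", "hung"));        // true
        System.out.println(partialMatch(".*u.*", "hung"));     // true
        System.out.println(findAfterMatches(".*u.*", "hung")); // false
    }
}
```

On a fresh matcher, "u" and ".*u.*" give the same answer under find(), which is the behaviour observed through the filter. The false result appears only when find() resumes after a successful matches() that already consumed the entire input.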

@Ted:
What you are suggesting is true, but for our user base it is rather counterintuitive: we are accustomed to writing the expression "abc.*" to search for the prefix "abc", rather than having to write "^abc.*" explicitly.
If I can't change the RegexStringComparator compareTo() method from find() to matches(), then I suppose I can implement a hard fix by adding "^" at the beginning of the search keyword.
Thanks for your quick responses.

Best regards,
Henry
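
One way to sketch the hard fix mentioned above is to wrap the user's expression before handing it to a find()-based comparator, so that find() can only succeed when the whole value matches. The class AnchoredRegex and its methods are hypothetical names used for illustration; the wrapping assumes the user's expression is a well-formed regex with balanced parentheses.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of HBase.
class AnchoredRegex {

    // Wraps the expression in a non-capturing group between \A (start of
    // input) and \z (end of input). The group keeps alternations such as
    // "a|b" from escaping the anchors.
    static String anchor(String userRegex) {
        return "\\A(?:" + userRegex + ")\\z";
    }

    // Subsequence search, the same kind of test the comparator performs.
    static boolean found(String regex, String value) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(found("u", "hung"));                 // true
        System.out.println(found(anchor("u"), "hung"));         // false
        System.out.println(found(anchor("abc.*"), "abcdef"));   // true
        System.out.println(found(anchor("abc.*"), "xabcdef"));  // false
        System.out.println(found("^a|b", "xb"));                // true
        System.out.println(found(anchor("a|b"), "xb"));         // false
    }
}
```

The anchored string would then be passed as the comparator pattern, for example new RegexStringComparator(AnchoredRegex.anchor(keyword)). Prefixing "^" alone covers the prefix-search case "abc.*", but it does not constrain the end of the value and it binds only to the first branch of an alternation, which is why the sketch uses a group with both anchors. \z is used instead of $ because in Java $ can also match just before a final line terminator.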


Re: RegexStringComparator problem: Why pattern "u" has the same result as ".*u.*" ?

Posted by Ted Yu <yu...@gmail.com>.
"u" is part of "hung", producing a match.

Do you want to find a string whose entire value is "u" (not a substring)?
In that case you can specify "^u$".

Cheers

