Posted to dev@hbase.apache.org by "Tokayer, Jason M." <Ja...@capitalone.com> on 2017/02/18 19:35:47 UTC

FilterList with a ColumnPaginationFilter in Java (Scala) Client

Hello,

I am having some difficulty understanding the results when I apply a ColumnPaginationFilter within a FilterList. I’m not sure whether this is an HBase bug or a gap in my understanding of how the API works.

Specifically, I’m noticing a difference between using MUST_PASS_ONE and MUST_PASS_ALL in my FilterList, even when the list contains only a single filter. Below I walk through a full but simplified example that illustrates the issue (i.e. I removed the other filters from the list because I have narrowed down the problem, but I still need to use a FilterList):

First, in the shell I create a table and insert multiple values with the same timestamp:
create 'ns:tbl',{NAME => 'family',VERSIONS => 100}
put 'ns:tbl','row','family:name','John',1000000000000
put 'ns:tbl','row','family:name','Jane',1000000000000
put 'ns:tbl','row','family:name','Gil',1000000000000
put 'ns:tbl','row','family:name','Jane',1000000000000

Now, I create a custom client written in Scala that uses the Java APIs:

import org.apache.hadoop.hbase.filter._
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.client._
import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration, TableName}
import scala.collection.mutable._

val config = HBaseConfiguration.create()
config.set("hbase.zookeeper.quorum", "localhost")
config.set("hbase.zookeeper.property.clientPort", "2181")

val connection = ConnectionFactory.createConnection(config)

// Switch to FilterList.Operator.MUST_PASS_ONE to compare the two behaviors.
val logicalOp = FilterList.Operator.MUST_PASS_ALL
val limit = 1
val resultsList = ListBuffer[String]()
val table = connection.getTable(TableName.valueOf("ns:tbl"))
for (offset <- 0 to 20 by limit) {
  // One Get per page: ask for `limit` columns starting at column `offset`.
  val paginationFilter = new ColumnPaginationFilter(limit, offset)
  val filterList: FilterList = new FilterList(logicalOp, paginationFilter)
  val results = table.get(new Get(Bytes.toBytes("row")).setFilter(filterList))
  val cells = results.rawCells()
  if (cells != null) {
    for (cell <- cells) {
      // Bytes.toString decodes as UTF-8 rather than the platform default charset.
      val value = Bytes.toString(CellUtil.cloneValue(cell))
      val qualifier = Bytes.toString(CellUtil.cloneQualifier(cell))
      val family = Bytes.toString(CellUtil.cloneFamily(cell))
      val result = "OFFSET = " + offset + ":" + family + "," + qualifier + "," + value + "," + cell.getTimestamp()
      println(result)
      resultsList.append(result)
    }
  }
}
table.close()


My results look like:
limit = 1 & logicalOp = MUST_PASS_ALL:
scala> resultsList.foreach(println)
OFFSET = 0:family,name,Jane,1000000000000

limit = 1 & logicalOp = MUST_PASS_ONE:
scala> resultsList.foreach(println)
OFFSET = 0:family,name,Jane,1000000000000
OFFSET = 1:family,name,Gil,1000000000000
OFFSET = 2:family,name,Jane,1000000000000
OFFSET = 3:family,name,John,1000000000000

limit = 2 & logicalOp = MUST_PASS_ALL:
scala> resultsList.foreach(println)
OFFSET = 0:family,name,Jane,1000000000000

limit = 2 & logicalOp = MUST_PASS_ONE:
scala> resultsList.foreach(println)
OFFSET = 0:family,name,Jane,1000000000000
OFFSET = 2:family,name,Jane,1000000000000


My main question is: why, when using MUST_PASS_ONE, don’t I get back only the single, most recently inserted value of the cell, as I do with MUST_PASS_ALL? Note that if I don’t use a FilterList at all and instead set the Get’s filter directly to the paginationFilter, I get the result I would expect (i.e. the single OFFSET = 0:family,name,Jane,1000000000000).
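
For comparison, the variant without a FilterList looks roughly like the following. This is a minimal sketch rather than my exact code: it assumes the same table, row and local ZooKeeper settings as above, and simply sets the ColumnPaginationFilter on the Get directly.

```scala
import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter
import org.apache.hadoop.hbase.util.Bytes

// Assumed connection settings, copied from the example above.
val config = HBaseConfiguration.create()
config.set("hbase.zookeeper.quorum", "localhost")
config.set("hbase.zookeeper.property.clientPort", "2181")

val connection = ConnectionFactory.createConnection(config)
val table = connection.getTable(TableName.valueOf("ns:tbl"))
val limit = 1

try {
  for (offset <- 0 to 20 by limit) {
    // The pagination filter is set on the Get itself, with no FilterList wrapper.
    val get = new Get(Bytes.toBytes("row"))
      .setFilter(new ColumnPaginationFilter(limit, offset))
    val cells = table.get(get).rawCells()
    if (cells != null) {
      for (cell <- cells) {
        val family = Bytes.toString(CellUtil.cloneFamily(cell))
        val qualifier = Bytes.toString(CellUtil.cloneQualifier(cell))
        val value = Bytes.toString(CellUtil.cloneValue(cell))
        println("OFFSET = " + offset + ":" + family + "," + qualifier + "," + value + "," + cell.getTimestamp())
      }
    }
  }
} finally {
  // Release client resources once the comparison run is done.
  table.close()
  connection.close()
}
```

The only difference from the FilterList version is the argument passed to setFilter, so any difference in output should come from the FilterList wrapper and its operator.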

The documentation isn’t entirely clear about this situation, and I’m hoping someone on either mailing list may be able to assist.

Best,
Jason