Posted to issues@hbase.apache.org by "Doug Meil (JIRA)" <ji...@apache.org> on 2013/05/17 19:57:16 UTC
[jira] [Resolved] (HBASE-8571) CopyTable and RowCounter don't seem to use setCaching setting

     [ https://issues.apache.org/jira/browse/HBASE-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil resolved HBASE-8571.
------------------------------

    Resolution: Won't Fix

Closing this as "won't fix", but I will say that the behavior is a bit surprising every time I look at it.

It seems like it would be less confusing if the default for the Scan's setCaching setting were applied within the Scan class rather than outside it (e.g., formerly in HTable, now in ClientScanner).
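
For illustration, the defaulting behavior described above can be sketched roughly as follows. This is a simplified stand-in, not the HBase source: the class ScanCachingSketch, the method effectiveCaching, the UNSET sentinel, and the FALLBACK_CACHING value are all hypothetical names chosen for this sketch, and a plain Map stands in for the job Configuration. The only real identifier is the configuration key hbase.client.scanner.caching. The assumption is that a Scan on which setCaching was never called carries a non-positive value, and that the scanner then falls back to the configured default.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of where the scanner-caching default is resolved.
// NOT the HBase source; all names except the config key are hypothetical.
final class ScanCachingSketch {
    // Real HBase configuration key for the client-side scanner caching default.
    static final String CACHING_KEY = "hbase.client.scanner.caching";
    // Hypothetical fallback for this sketch; the shipped default varies by HBase version.
    static final int FALLBACK_CACHING = 1;
    // Assumed sentinel meaning "setCaching was never called on the Scan".
    static final int UNSET = -1;

    // Resolve the caching value outside the Scan object:
    // an explicit positive value on the Scan wins, otherwise the configuration decides.
    static int effectiveCaching(int scanCaching, Map<String, String> conf) {
        if (scanCaching > 0) {
            return scanCaching;
        }
        String configured = conf.get(CACHING_KEY);
        return configured == null ? FALLBACK_CACHING : Integer.parseInt(configured);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Scan without setCaching, nothing configured: the fallback applies.
        System.out.println(effectiveCaching(UNSET, conf));
        // Scan without setCaching, value configured on the job: the configuration applies.
        conf.put(CACHING_KEY, "500");
        System.out.println(effectiveCaching(UNSET, conf));
        // Explicit setCaching on the Scan overrides the configuration.
        System.out.println(effectiveCaching(50, conf));
    }
}
```

Under this model, a job that builds a Scan without calling setCaching still picks up whatever caching value is present in the job configuration, which is easy to miss when reading the Scan class alone.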
                
> CopyTable and RowCounter don't seem to use setCaching setting
> -------------------------------------------------------------
>
>                 Key: HBASE-8571
>                 URL: https://issues.apache.org/jira/browse/HBASE-8571
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Doug Meil
>
> Maybe it's just me, but I've been looking on trunk and I don't see where either RowCounter or CopyTable MapReduce can adjust the setCaching setting on the Scan instance.
> Example from RowCounter...
> {code}
>     Job job = new Job(conf, NAME + "_" + tableName);
>     job.setJarByClass(RowCounter.class);
>     Scan scan = new Scan();
>     scan.setCacheBlocks(false);
>     Set<byte []> qualifiers = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
>     if (startKey != null && !startKey.equals("")) {
>       scan.setStartRow(Bytes.toBytes(startKey));
>     }
>     if (endKey != null && !endKey.equals("")) {
>       scan.setStopRow(Bytes.toBytes(endKey));
>     }
>     scan.setFilter(new FirstKeyOnlyFilter());
>     if (sb.length() > 0) {
>       for (String columnName : sb.toString().trim().split(" ")) {
>         String [] fields = columnName.split(":");
>         if(fields.length == 1) {
>           scan.addFamily(Bytes.toBytes(fields[0]));
>         } else {
>           byte[] qualifier = Bytes.toBytes(fields[1]);
>           qualifiers.add(qualifier);
>           scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
>         }
>       }
>     }
>     // specified column may or may not be part of first key value for the row.
>     // Hence do not use FirstKeyOnlyFilter if scan has columns, instead use
>     // FirstKeyValueMatchingQualifiersFilter.
>     if (qualifiers.size() == 0) {
>       scan.setFilter(new FirstKeyOnlyFilter());
>     } else {
>       scan.setFilter(new FirstKeyValueMatchingQualifiersFilter(qualifiers));
>     }
>     job.setOutputFormatClass(NullOutputFormat.class);
>     TableMapReduceUtil.initTableMapperJob(tableName, scan,
>       RowCounterMapper.class, ImmutableBytesWritable.class, Result.class, job);
>     job.setNumReduceTasks(0);
>     return job;
> {code}
> TableMapReduceUtil only serializes the Scan into the job; it doesn't adjust any of the settings.
> Maybe I'm missing something, but this seems like a problem.
