Posted to user@hive.apache.org by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2013/05/17 10:04:12 UTC

Fwd: How to read data in column wise (readColumnsBatch())

I have the code below, which reads data row-wise:


private void readSpecificColumns() throws IOException, SerDeException {
    ColumnarSerDe serDe = new ColumnarSerDe();
    serDe.initialize(conf, createProperties());

    // long start = System.currentTimeMillis();
    // Project only the columns that should be read.
    java.util.ArrayList<Integer> readCols = new java.util.ArrayList<Integer>();
    readCols.add(Integer.valueOf(2));
    readCols.add(Integer.valueOf(3));
    ColumnProjectionUtils.setReadColumnIDs(conf, readCols);
    RCFile.Reader reader = new RCFile.Reader(fs, rcfile, conf);

    LongWritable rowID = new LongWritable();
    BytesRefArrayWritable cols = new BytesRefArrayWritable();

    // Row-wise loop: advance one row, then fetch and deserialize it.
    while (reader.next(rowID)) {
        reader.getCurrentRow(cols);
        cols.resetValid(8);
        Object row = serDe.deserialize(cols);

        StructObjectInspector oi = (StructObjectInspector) serDe.getObjectInspector();
        List<? extends StructField> fieldRefs = oi.getAllStructFieldRefs();
        for (int i : readCols) {
            Object fieldData = oi.getStructFieldData(row, fieldRefs.get(i));
            Object standardWritableData = ObjectInspectorUtils.copyToStandardObject(
                    fieldData,
                    fieldRefs.get(i).getFieldObjectInspector(),
                    ObjectInspectorCopyOption.WRITABLE);
            if (standardWritableData != null) {
                System.out.println("standardWritableData : " + standardWritableData.toString());
            }
        }
    }
    reader.close();
}

In the code above, readCols holds 0,3.

The output, when I want to read columns 0 and 2, is:
col0,col3
col0,col3
col0,col3


Q) How do I convert the above code to read column-wise, for columns 0 and 2?
I replaced reader.next() with reader.readColumnsBatch() and getCurrentRow()
with getColumn(i, cols), but I did not get any output.
Q) getCurrentRow() expects a LongWritable. What does it signify? Suppose I
have 100 rows: if I pass 10, will it retrieve the 10th row? In the above code
it is never modified and holds the default value of 0. I went through the code
of getCurrentRow() but could not figure it out.
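
As a rough illustration of the first question, the sketch below contrasts a
row-wise loop with a column-wise loop over an in-memory stand-in for a
columnar file. Everything in it (ColumnBatchSketch, RowGroup, readRowWise,
readColumnWise, sampleGroups) is hypothetical and is not Hive's RCFile.Reader
API. It only shows the loop shape that column-wise access implies: advance one
row group at a time, then take each projected column as a whole batch.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a columnar file. Hypothetical; not Hive's RCFile API. */
public class ColumnBatchSketch {

    /** One row group: values[c] holds every value of column c in this group. */
    public static final class RowGroup {
        final String[][] values;

        RowGroup(String[][] values) {
            this.values = values;
        }

        int rowCount() {
            return values[0].length;
        }
    }

    /** Row-wise: visit each row and pick the projected columns out of it. */
    public static List<String> readRowWise(List<RowGroup> groups, int[] cols) {
        List<String> out = new ArrayList<>();
        for (RowGroup g : groups) {
            for (int r = 0; r < g.rowCount(); r++) {
                StringBuilder row = new StringBuilder();
                for (int i = 0; i < cols.length; i++) {
                    if (i > 0) {
                        row.append(',');
                    }
                    row.append(g.values[cols[i]][r]);
                }
                out.add(row.toString());
            }
        }
        return out;
    }

    /** Column-wise: step to the next row group, then take each projected column whole. */
    public static List<String> readColumnWise(List<RowGroup> groups, int[] cols) {
        List<String> out = new ArrayList<>();
        for (RowGroup g : groups) {
            for (int c : cols) {
                out.add("col" + c + "=" + String.join(",", g.values[c]));
            }
        }
        return out;
    }

    /** Two row groups over three columns: two rows in the first group, one in the second. */
    public static List<RowGroup> sampleGroups() {
        List<RowGroup> groups = new ArrayList<>();
        groups.add(new RowGroup(new String[][] {{"a0", "a1"}, {"b0", "b1"}, {"c0", "c1"}}));
        groups.add(new RowGroup(new String[][] {{"a2"}, {"b2"}, {"c2"}}));
        return groups;
    }

    public static void main(String[] args) {
        int[] cols = {0, 2};
        System.out.println(readRowWise(sampleGroups(), cols));
        System.out.println(readColumnWise(sampleGroups(), cols));
    }
}
```

In this model the row-wise loop emits one entry per row, while the column-wise
loop emits one entry per projected column per row group, so the two loops do
not produce output in the same order or granularity.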

Regards,
Deepak





