Posted to user@hive.apache.org by ๏̯͡๏ <ÐΞ€ρ@Ҝ>,
de...@gmail.com on 2013/05/17 10:04:12 UTC
Fwd: How to read data in column wise (readColumnsBatch())
I have the code below, which reads data row-wise:
private void readSpecificColumns() throws IOException, SerDeException {
    ColumnarSerDe serDe = new ColumnarSerDe();
    serDe.initialize(conf, createProperties());
    // long start = System.currentTimeMillis();
    java.util.ArrayList<Integer> readCols = new java.util.ArrayList<Integer>();
    readCols.add(Integer.valueOf(2));
    readCols.add(Integer.valueOf(3));
    ColumnProjectionUtils.setReadColumnIDs(conf, readCols);
    RCFile.Reader reader = new RCFile.Reader(fs, rcfile, conf);
    LongWritable rowID = new LongWritable();
    BytesRefArrayWritable cols = new BytesRefArrayWritable();
    while (reader.next(rowID)) {
        reader.getCurrentRow(cols);
        cols.resetValid(8);
        Object row = serDe.deserialize(cols);
        StructObjectInspector oi =
            (StructObjectInspector) serDe.getObjectInspector();
        List<? extends StructField> fieldRefs = oi.getAllStructFieldRefs();
        for (int i : readCols) {
            Object fieldData = oi.getStructFieldData(row, fieldRefs.get(i));
            Object standardWritableData =
                ObjectInspectorUtils.copyToStandardObject(
                    fieldData,
                    fieldRefs.get(i).getFieldObjectInspector(),
                    ObjectInspectorCopyOption.WRITABLE);
            if (standardWritableData != null) {
                System.out.println("standardWritableData : "
                    + standardWritableData.toString());
            }
        }
    }
    reader.close();
}
In the above code, readCols holds 0,3.
If I want to read columns 0 and 2, the output is:
col0,col3
col0,col3
col0,col3
Q) How do I convert the above code to read column-wise, for columns 0 and 2?
I replaced reader.next() with reader.readColumnsBatch() and getCurrentRow()
with getColumn(i, cols), but I did not get any output.
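To make clear what I mean by row-wise versus column-wise, here is a toy sketch of the two access patterns. It uses plain Java lists rather than RCFile, and every name in it (ColumnBatchSketch, readRowWise, readColumnWise) is made up for illustration; it is not the Hive API and says nothing about how RCFile.Reader behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: a "batch" keeps its values column by column.
// None of these names come from Hive; they are hypothetical.
public class ColumnBatchSketch {

    // columns.get(c).get(r) is the value of column c in row r.
    // Row-wise: visit each row and pick the wanted columns from it.
    public static List<String> readRowWise(List<List<String>> columns, int[] wanted) {
        List<String> out = new ArrayList<>();
        if (columns.isEmpty()) {
            return out;
        }
        int rows = columns.get(0).size();
        for (int r = 0; r < rows; r++) {
            for (int c : wanted) {
                out.add(columns.get(c).get(r));
            }
        }
        return out;
    }

    // Column-wise: take every value of one wanted column, then the next.
    public static List<String> readColumnWise(List<List<String>> columns, int[] wanted) {
        List<String> out = new ArrayList<>();
        for (int c : wanted) {
            out.addAll(columns.get(c));
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> batch = Arrays.asList(
            Arrays.asList("a0", "a1", "a2"),
            Arrays.asList("b0", "b1", "b2"),
            Arrays.asList("c0", "c1", "c2"));
        int[] wanted = {0, 2};
        System.out.println(readRowWise(batch, wanted));    // [a0, c0, a1, c1, a2, c2]
        System.out.println(readColumnWise(batch, wanted)); // [a0, a1, a2, c0, c1, c2]
    }
}
```

The second ordering is what I am after: all values of column 0 for the batch, followed by all values of column 2.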
Q) getCurrentRow() expects a LongWritable. What does it signify? Suppose I
have 100 rows and I pass 10: will it retrieve the 10th row? In the above code
it is never modified and holds the default value of 0. I went through the code
of getCurrentRow() but could not figure it out.
Regards,
Deepak