Posted to issues@hbase.apache.org by "Guido Serra aka Zeph (JIRA)" <ji...@apache.org> on 2013/01/22 18:16:13 UTC
[jira] [Created] (HBASE-7645) put without timestamp duplicates the record/row
Guido Serra aka Zeph created HBASE-7645:
-------------------------------------------
Summary: put without timestamp duplicates the record/row
Key: HBASE-7645
URL: https://issues.apache.org/jira/browse/HBASE-7645
Project: HBase
Issue Type: Brainstorming
Components: Client
Reporter: Guido Serra aka Zeph
If I run Sqoop several times on the same dataset with HBase as the output,
I end up with duplicated data: each run stores another version of the same cell, even though the value is identical.
{code}
hbase(main):030:0> get "dump_HKFAS.sales_order", "1", {COLUMN => "mysql:created_at", VERSIONS => 4}
COLUMN CELL
mysql:created_at timestamp=1358853505756, value=2011-12-21 18:07:38.0
mysql:created_at timestamp=1358790515451, value=2011-12-21 18:07:38.0
2 row(s) in 0.0040 seconds
# today's Sqoop run
hbase(main):031:0> Date.new(1358853505756).toString()
=> "Tue Jan 22 11:18:25 UTC 2013"
# yesterday's Sqoop run
hbase(main):032:0> Date.new(1358790515451).toString()
=> "Mon Jan 21 17:48:35 UTC 2013"
{code}
The Put.add() method writes the KeyValue without checking whether the value has changed, so a cell that differs only in its timestamp is stored again. Is this by design, or is it a bug?
from: trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Put.java
{code}
  /**
   * Add the specified column and value to this Put operation.
   * @param family family name
   * @param qualifier column qualifier
   * @param value column value
   * @return this
   */
  public Put add(byte [] family, byte [] qualifier, byte [] value) {
    return add(family, qualifier, this.ts, value);
  }

  /**
   * Add the specified column and value, with the specified timestamp as
   * its version to this Put operation.
   * @param family family name
   * @param qualifier column qualifier
   * @param ts version timestamp
   * @param value column value
   * @return this
   */
  public Put add(byte [] family, byte [] qualifier, long ts, byte [] value) {
    List<KeyValue> list = getKeyValueList(family);
    KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
    list.add(kv);
    familyMap.put(kv.getFamily(), list);
    return this;
  }
{code}
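To make the question concrete: the timestamp is part of a cell's coordinates in HBase, and a put that leaves it unset is stamped with the current time when it is written, so every run lands on a new version. The sketch below is a toy in-memory model of that behaviour, not the HBase client API. The class VersionedCellSketch, its methods, and the value of stableTs are hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of versioned cells (hypothetical; not the HBase client API). */
final class VersionedCellSketch {
    // "row/column" -> (timestamp -> value)
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();

    /** Stores a value under an explicit timestamp; the same timestamp overwrites. */
    void put(String row, String column, long ts, String value) {
        cells.computeIfAbsent(row + "/" + column, k -> new TreeMap<>())
             .put(ts, value);
    }

    /** Returns the timestamps stored for a cell, newest first. */
    List<Long> versions(String row, String column) {
        TreeMap<Long, String> v = cells.get(row + "/" + column);
        return v == null ? new ArrayList<>() : new ArrayList<>(v.descendingKeySet());
    }

    public static void main(String[] args) {
        String value = "2011-12-21 18:07:38.0";

        // Case 1: each run is stamped with its own write time (the two
        // timestamps from the shell output above), so two versions remain.
        VersionedCellSketch wallClock = new VersionedCellSketch();
        wallClock.put("1", "mysql:created_at", 1358790515451L, value);
        wallClock.put("1", "mysql:created_at", 1358853505756L, value);
        System.out.println(wallClock.versions("1", "mysql:created_at").size()); // two versions

        // Case 2: both runs use the same timestamp (hypothetical choice: the
        // row's created_at read as UTC epoch millis), so the second put
        // lands on the same coordinates and only one version remains.
        long stableTs = 1324490858000L;
        VersionedCellSketch stable = new VersionedCellSketch();
        stable.put("1", "mysql:created_at", stableTs, value);
        stable.put("1", "mysql:created_at", stableTs, value);
        System.out.println(stable.versions("1", "mysql:created_at").size()); // one version
    }
}
```

With the real client, the analogous choice would be the four-argument add(family, qualifier, ts, value) quoted above, called with a timestamp that is stable across runs. Whether Sqoop can be configured to do this is not something the sketch assumes.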