Posted to common-issues@hadoop.apache.org by "VITALIY SAVCHENKO (JIRA)" <ji...@apache.org> on 2016/10/21 09:37:58 UTC

[jira] [Comment Edited] (HADOOP-13725) Open MapFile for append

    [ https://issues.apache.org/jira/browse/HADOOP-13725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15590692#comment-15590692 ] 

VITALIY SAVCHENKO edited comment on HADOOP-13725 at 10/21/16 9:37 AM:
----------------------------------------------------------------------

{code}
1.
        MapFile.Writer w = null;
        try {
            w = new MapFile.Writer(
                new Configuration(),
                new Path("hdfs://192.168.56.101:9000/20161020/data"),
                MapFile.Writer.keyClass(IntWritable.class),
                MapFile.Writer.valueClass(IntWritable.class)
            );
            w.append(new IntWritable(0), new IntWritable(100));
            w.append(new IntWritable(10), new IntWritable(200));
            w.append(new IntWritable(5), new IntWritable(400)); //java.io.IOException: key out of order: 5 after 10
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            if (w != null) { // guard: the constructor may have thrown before w was assigned
                w.close();
            }
        }
        MapFile.Reader reader = new MapFile.Reader(
            new Path("hdfs://192.168.56.101:9000/20161020/data"),
            new Configuration()
        );
        System.out.println(reader.get(new IntWritable(0), new IntWritable())); //print 100
        System.out.println(reader.get(new IntWritable(10), new IntWritable())); // print 200, MapFile correct
        reader.close();
2. Open MapFile for append
        MapFile.Writer w = null;
        try {
            w = new MapFile.Writer(
                new Configuration(),
                new Path("hdfs://192.168.56.101:9000/20161020/data"),
                MapFile.Writer.keyClass(IntWritable.class),
                MapFile.Writer.valueClass(IntWritable.class)
            );
            w.append(new IntWritable(0), new IntWritable(100));
            w.append(new IntWritable(10), new IntWritable(200));
            w.close();

            w = new MapFile.Writer(
                new Configuration(),
                new Path("hdfs://192.168.56.101:9000/20161020/data"),
                SequenceFile.Writer.appendIfExists(true), // append to existing MapFile
                SequenceFile.Writer.replication((short)2),
                MapFile.Writer.keyClass(IntWritable.class),
                MapFile.Writer.valueClass(IntWritable.class)
            );
            w.append(new IntWritable(20), new IntWritable(300));
            w.append(new IntWritable(30), new IntWritable(400));
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            if (w != null) {
                w.close();
            }
        }
        MapFile.Reader reader = new MapFile.Reader(
            new Path("hdfs://192.168.56.101:9000/20161020/data"),
            new Configuration()
        );
        System.out.println(reader.get(new IntWritable(10), new IntWritable())); //print 200
        System.out.println(reader.get(new IntWritable(20), new IntWritable())); //print 300 MapFile correct
        reader.close();

3. Append to existing MapFile, but with out-of-order keys
        MapFile.Writer w = null;
        try {
            w = new MapFile.Writer(
                new Configuration(),
                new Path("hdfs://192.168.56.101:9000/20161020/data"),
                MapFile.Writer.keyClass(IntWritable.class),
                MapFile.Writer.valueClass(IntWritable.class)
            );
            w.append(new IntWritable(10), new IntWritable(100));
            w.append(new IntWritable(20), new IntWritable(200));
            w.close();

            w = new MapFile.Writer(
                new Configuration(),
                new Path("hdfs://192.168.56.101:9000/20161020/data"),
                SequenceFile.Writer.appendIfExists(true), // append to existing MapFile
                MapFile.Writer.keyClass(IntWritable.class),
                MapFile.Writer.valueClass(IntWritable.class)
            );
            w.append(new IntWritable(5), new IntWritable(300)); // no exception here, although 5 is smaller than the existing last key 20
            w.append(new IntWritable(10), new IntWritable(400));
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            if (w != null) {
                w.close();
            }
        }

        MapFile.Reader reader = new MapFile.Reader(
            new Path("hdfs://192.168.56.101:9000/20161020/data"),
            new Configuration()
        );
        System.out.println(reader.get(new IntWritable(5), new IntWritable())); // java.io.IOException: key out of order: 5 after 10 - MapFile corrupted
        System.out.println(reader.get(new IntWritable(10), new IntWritable()));
        System.out.println(reader.get(new IntWritable(20), new IntWritable()));
        reader.close();
{code}

Reason: when a MapFile is opened with the option SequenceFile.Writer.appendIfExists(true), the writer does not read the last key from the existing MapFile, so subsequent appends are never checked against it and out-of-order keys are silently accepted.
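The missing check can be sketched without a cluster. The snippet below is an illustration in plain Java (a hypothetical OrderedAppender class, not the actual MapFile.Writer source) of what the writer would need to do when appendIfExists(true) is set: initialize its last-seen key from the existing file (for instance via MapFile.Reader.finalKey()) and validate every new key against it, the same way the writer already validates keys written within one session.

```java
import java.io.IOException;

// Hypothetical illustration, not Hadoop source: the ordering check that
// MapFile.Writer skips for the first keys appended after reopening with
// appendIfExists(true).
class OrderedAppender {
    private Integer lastKey; // last key written, or null for an empty file

    // A real fix would initialize this from the existing file's final key,
    // e.g. obtained with MapFile.Reader.finalKey(), before accepting appends.
    OrderedAppender(Integer lastKeyFromExistingFile) {
        this.lastKey = lastKeyFromExistingFile;
    }

    void append(int key) throws IOException {
        // MapFile requires non-decreasing keys, so only strictly smaller
        // keys are rejected (duplicate keys are allowed).
        if (lastKey != null && key < lastKey) {
            throw new IOException("key out of order: " + key + " after " + lastKey);
        }
        lastKey = key;
    }
}

public class Main {
    public static void main(String[] args) throws IOException {
        // Simulate reopening the MapFile from example 3, whose final key is 20.
        OrderedAppender w = new OrderedAppender(20);
        try {
            w.append(5); // would corrupt the file, so it must be rejected
        } catch (IOException ex) {
            System.out.println(ex.getMessage()); // prints "key out of order: 5 after 20"
        }
        w.append(30); // 30 >= 20, accepted
        System.out.println("append(30) accepted");
    }
}
```

With such a check in place, example 3 above would fail at the first bad append instead of producing a corrupted file that only breaks later, at read time.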




> Open MapFile for append
> -----------------------
>
>                 Key: HADOOP-13725
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13725
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: VITALIY SAVCHENKO
>
> I think it should be possible to open a MapFile for appending.
> SequenceFile already supports this via the option SequenceFile.Writer.appendIfExists(true) (HADOOP-7139).
> It almost works now, but when SequenceFile.Writer.appendIfExists(true) is passed to MapFile.Writer, the writer does not read the last key from the existing file and therefore does not check new keys against it. As a result the MapFile can become corrupted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org