Posted to user@orc.apache.org by István <le...@gmail.com> on 2016/01/07 21:23:20 UTC

Writing ORC files without HDFS

Hi all,

I am working on a project that requires me to write ORC files locally, to a
non-HDFS location. I was wondering whether any project already does something
similar, but after spending some time on Google I could not find one.

I think what needs to be done is to re-implement the ORC Writer, along the
lines of the following but leaving out Hadoop:

https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java

Am I on the right track implementing this?

Let me know if you have any suggestions or links in the subject.

Thank you very much,
Istvan

-- 
the sun shines for all

Re: Writing ORC files without HDFS

Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Oh ok. Thanks for the info.

Regards,
Ravi



From:   István <le...@gmail.com>
To:     user@orc.apache.org
Date:   01/13/2016 08:23 PM
Subject:        Re: Writing ORC files without HDFS



Hi Ravi,

I got the code working here:

https://github.com/StreamBright/orcdemo/blob/master/src/main/java/org/streambright/orcdemo/App.java

It seems the OrcFile.createWriter takes a path on the local filesystem and 
there is no need for FileSystem.getLocal.

Regards,
Istvan

On Mon, Jan 11, 2016 at 10:46 AM, Ravi Tatapudi <ra...@in.ibm.com> 
wrote:
Yes. I think, including the below example-code, to ORC-documentation would 
be useful (for test purposes...etc).

Regards,
Ravi




From:        Lefty Leverenz <le...@gmail.com>
To:        user@orc.apache.org
Date:        01/11/2016 03:08 PM
Subject:        Re: Writing ORC files without HDFS




Should this be included in the ORC documentation?

-- Lefty

On Fri, Jan 8, 2016 at 2:33 PM, István <le...@gmail.com> wrote:
Hi Ravi,

Excellent response, thank you very much, this is exactly I was looking 
for!

Best regards,
Istvan

On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <ra...@in.ibm.com> 
wrote:
Hello,

You can write ORC-files on local-filesystem, by getting the local 
"FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the 
simple-example given below & see if it works for your requirement.

=================================================
public class orcw {

        private static Configuration conf = new Configuration();
        public static Writer writer ;

public static class OrcRow
{
        int col1 ;
        String col2 ;
        String col3 ;

        OrcRow(int a, String b, String c) {
        this.col1  = a ; 
        this.col2  = b ; 
        this.col3  = c ; 
        }
}

public static void main(String[] args) throws IOException,

    InterruptedException, ClassNotFoundException {

            String path = "/tmp/orcfile1";

            try {

            conf = new Configuration();
        FileSystem fs = FileSystem.getLocal(conf);

            ObjectInspector ObjInspector = 
ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class, 
ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
            writer = OrcFile.createWriter(new Path(path), 
OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));

        writer.addRow(new OrcRow(1,"hello","orcFile")) ;
        writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;

            writer.close();
            } 
            catch (Exception e)
            {
                    e.printStackTrace();
            }
    }
}
=================================================

Thanks,
 Ravi



From:        István <le...@gmail.com>
To:        user@orc.apache.org
Date:        01/08/2016 01:53 AM
Subject:        Writing ORC files without HDFS




Hi all,

I am working on a project that requires me to write ORC files locally on a 
non-HDFS location. I was wondering if there is any project doing something 
similar, but I guess there is none, after spending some time on Google.

I think what needs to get done is to re-implement the ORC Writer, sort of 
similar to the following but leaving out Hadoop:

https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java


Am I on the right track implementing this? 

Let me know if you have any suggestions or links in the subject.

Thank you very much,
Istvan

-- 
the sun shines for all






-- 
the sun shines for all







-- 
the sun shines for all





Re: Writing ORC files without HDFS

Posted by István <le...@gmail.com>.
Hi Ravi,

I got the code working here:

https://github.com/StreamBright/orcdemo/blob/master/src/main/java/org/streambright/orcdemo/App.java

It seems that OrcFile.createWriter takes a path on the local filesystem
directly, so there is no need for FileSystem.getLocal.
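
For illustration, a minimal sketch of that point, reusing the same hive-exec ORC API as the example quoted below in this thread (the class name, row fields, and output file name here are made up):

=================================================
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

public class LocalOrcWrite {

    // Row layout; the ObjectInspector below is derived from it by reflection.
    public static class Row {
        int id;
        String name;

        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) throws Exception {
        // With a default Configuration, fs.defaultFS is file:///, so a plain
        // path lands on the local disk; no FileSystem.getLocal(conf) call is needed.
        Configuration conf = new Configuration();

        ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector(
                Row.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA);

        Writer writer = OrcFile.createWriter(new Path("/tmp/local-orc-demo.orc"),
                OrcFile.writerOptions(conf)
                        .inspector(inspector)
                        .compress(CompressionKind.ZLIB));

        writer.addRow(new Row(1, "hello"));
        writer.addRow(new Row(2, "world"));
        writer.close();
    }
}
=================================================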

Regards,
Istvan

On Mon, Jan 11, 2016 at 10:46 AM, Ravi Tatapudi <ra...@in.ibm.com>
wrote:

> Yes. I think, including the below example-code, to ORC-documentation would
> be useful (for test purposes...etc).
>
> Regards,
> Ravi
>
>
>
>
> From:        Lefty Leverenz <le...@gmail.com>
> To:        user@orc.apache.org
> Date:        01/11/2016 03:08 PM
> Subject:        Re: Writing ORC files without HDFS
>
> ------------------------------
>
>
>
> Should this be included in the ORC documentation?
>
> -- Lefty
>
> On Fri, Jan 8, 2016 at 2:33 PM, István <*leccine@gmail.com*
> <le...@gmail.com>> wrote:
> Hi Ravi,
>
> Excellent response, thank you very much, this is exactly I was looking for!
>
> Best regards,
> Istvan
>
> On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <*ravi_tatapudi@in.ibm.com*
> <ra...@in.ibm.com>> wrote:
> Hello,
>
> You can write ORC-files on local-filesystem, by getting the local
> "FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the
> simple-example given below & see if it works for your requirement.
>
> =================================================
> public class orcw {
>
>         private static Configuration conf = new Configuration();
>         public static Writer writer ;
>
> public static class OrcRow
> {
>         int col1 ;
>         String col2 ;
>         String col3 ;
>
>         OrcRow(int a, String b, String c) {
>         this.col1  = a ;
>         this.col2  = b ;
>         this.col3  = c ;
>         }
> }
>
> public static void main(String[] args) throws IOException,
>
>     InterruptedException, ClassNotFoundException {
>
>             String path = "/tmp/orcfile1";
>
>             try {
>
>             conf = new Configuration();
>         FileSystem fs = FileSystem.getLocal(conf);
>
>             ObjectInspector ObjInspector =
> ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class,
> ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
>             writer = OrcFile.createWriter(new Path(path),
> OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));
>
>         writer.addRow(new OrcRow(1,"hello","orcFile")) ;
>         writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;
>
>             writer.close();
>             }
>             catch (Exception e)
>             {
>                     e.printStackTrace();
>             }
>     }
> }
> =================================================
>
> Thanks,
>  Ravi
>
>
>
> From:        István <*leccine@gmail.com* <le...@gmail.com>>
> To:        *user@orc.apache.org* <us...@orc.apache.org>
> Date:        01/08/2016 01:53 AM
> Subject:        Writing ORC files without HDFS
> ------------------------------
>
>
>
>
> Hi all,
>
> I am working on a project that requires me to write ORC files locally on a
> non-HDFS location. I was wondering if there is any project doing something
> similar, but I guess there is none, after spending some time on Google.
>
> I think what needs to get done is to re-implement the ORC Writer, sort of
> similar to the following but leaving out Hadoop:
>
>
> *https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java*
> <https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java>
>
> Am I on the right track implementing this?
>
> Let me know if you have any suggestions or links in the subject.
>
> Thank you very much,
> Istvan
>
> --
> the sun shines for all
>
>
>
>
>
>
> --
> the sun shines for all
>
>
>
>
>


-- 
the sun shines for all

Re: Writing ORC files without HDFS

Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Yes, I think including the example code below in the ORC documentation would
be useful (for test purposes, etc.).

Regards,
Ravi




From:   Lefty Leverenz <le...@gmail.com>
To:     user@orc.apache.org
Date:   01/11/2016 03:08 PM
Subject:        Re: Writing ORC files without HDFS



Should this be included in the ORC documentation?

-- Lefty

On Fri, Jan 8, 2016 at 2:33 PM, István <le...@gmail.com> wrote:
Hi Ravi,

Excellent response, thank you very much, this is exactly I was looking 
for!

Best regards,
Istvan

On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <ra...@in.ibm.com> 
wrote:
Hello,

You can write ORC-files on local-filesystem, by getting the local 
"FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the 
simple-example given below & see if it works for your requirement.

=================================================
public class orcw {

        private static Configuration conf = new Configuration();
        public static Writer writer ;

public static class OrcRow
{
        int col1 ;
        String col2 ;
        String col3 ;

        OrcRow(int a, String b, String c) {
        this.col1  = a ; 
        this.col2  = b ; 
        this.col3  = c ; 
        }
}

public static void main(String[] args) throws IOException,

    InterruptedException, ClassNotFoundException {

            String path = "/tmp/orcfile1";

            try {

            conf = new Configuration();
        FileSystem fs = FileSystem.getLocal(conf);

            ObjectInspector ObjInspector = 
ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class, 
ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
            writer = OrcFile.createWriter(new Path(path), 
OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));


        writer.addRow(new OrcRow(1,"hello","orcFile")) ;
        writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;

            writer.close();
            } 
            catch (Exception e)
            {
                    e.printStackTrace();
            }
    }
}
=================================================

Thanks,
 Ravi



From:        István <le...@gmail.com>
To:        user@orc.apache.org
Date:        01/08/2016 01:53 AM
Subject:        Writing ORC files without HDFS




Hi all,

I am working on a project that requires me to write ORC files locally on a 
non-HDFS location. I was wondering if there is any project doing something 
similar, but I guess there is none, after spending some time on Google.

I think what needs to get done is to re-implement the ORC Writer, sort of 
similar to the following but leaving out Hadoop:

https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java


Am I on the right track implementing this? 

Let me know if you have any suggestions or links in the subject.

Thank you very much,
Istvan

-- 
the sun shines for all






-- 
the sun shines for all






Re: Writing ORC files without HDFS

Posted by István <le...@gmail.com>.
I have created a small project with this code; if I can add it to the
documentation, I would be glad to.

Regards,
Istvan

On Mon, Jan 11, 2016 at 10:38 AM, Lefty Leverenz <le...@gmail.com>
wrote:

> Should this be included in the ORC documentation?
>
> -- Lefty
>
> On Fri, Jan 8, 2016 at 2:33 PM, István <le...@gmail.com> wrote:
>
>> Hi Ravi,
>>
>> Excellent response, thank you very much, this is exactly I was looking
>> for!
>>
>> Best regards,
>> Istvan
>>
>> On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <ra...@in.ibm.com>
>> wrote:
>>
>>> Hello,
>>>
>>> You can write ORC-files on local-filesystem, by getting the local
>>> "FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the
>>> simple-example given below & see if it works for your requirement.
>>>
>>> =================================================
>>> public class orcw {
>>>
>>>         private static Configuration conf = new Configuration();
>>>         public static Writer writer ;
>>>
>>> public static class OrcRow
>>> {
>>>         int col1 ;
>>>         String col2 ;
>>>         String col3 ;
>>>
>>>         OrcRow(int a, String b, String c) {
>>>         this.col1  = a ;
>>>         this.col2  = b ;
>>>         this.col3  = c ;
>>>         }
>>> }
>>>
>>> public static void main(String[] args) throws IOException,
>>>
>>>     InterruptedException, ClassNotFoundException {
>>>
>>>             String path = "/tmp/orcfile1";
>>>
>>>             try {
>>>
>>>             conf = new Configuration();
>>>         FileSystem fs = FileSystem.getLocal(conf);
>>>
>>>             ObjectInspector ObjInspector =
>>> ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class,
>>> ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
>>>             writer = OrcFile.createWriter(new Path(path),
>>> OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));
>>>
>>>         writer.addRow(new OrcRow(1,"hello","orcFile")) ;
>>>         writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;
>>>
>>>             writer.close();
>>>             }
>>>             catch (Exception e)
>>>             {
>>>                     e.printStackTrace();
>>>             }
>>>     }
>>> }
>>> =================================================
>>>
>>> Thanks,
>>>  Ravi
>>>
>>>
>>>
>>> From:        István <le...@gmail.com>
>>> To:        user@orc.apache.org
>>> Date:        01/08/2016 01:53 AM
>>> Subject:        Writing ORC files without HDFS
>>> ------------------------------
>>>
>>>
>>>
>>> Hi all,
>>>
>>> I am working on a project that requires me to write ORC files locally on
>>> a non-HDFS location. I was wondering if there is any project doing
>>> something similar, but I guess there is none, after spending some time on
>>> Google.
>>>
>>> I think what needs to get done is to re-implement the ORC Writer, sort
>>> of similar to the following but leaving out Hadoop:
>>>
>>>
>>> *https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java*
>>> <https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java>
>>>
>>> Am I on the right track implementing this?
>>>
>>> Let me know if you have any suggestions or links in the subject.
>>>
>>> Thank you very much,
>>> Istvan
>>>
>>> --
>>> the sun shines for all
>>>
>>>
>>>
>>>
>>
>>
>> --
>> the sun shines for all
>>
>>
>>
>


-- 
the sun shines for all

Re: Writing ORC files without HDFS

Posted by Lefty Leverenz <le...@gmail.com>.
Should this be included in the ORC documentation?

-- Lefty

On Fri, Jan 8, 2016 at 2:33 PM, István <le...@gmail.com> wrote:

> Hi Ravi,
>
> Excellent response, thank you very much, this is exactly I was looking for!
>
> Best regards,
> Istvan
>
> On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <ra...@in.ibm.com>
> wrote:
>
>> Hello,
>>
>> You can write ORC-files on local-filesystem, by getting the local
>> "FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the
>> simple-example given below & see if it works for your requirement.
>>
>> =================================================
>> public class orcw {
>>
>>         private static Configuration conf = new Configuration();
>>         public static Writer writer ;
>>
>> public static class OrcRow
>> {
>>         int col1 ;
>>         String col2 ;
>>         String col3 ;
>>
>>         OrcRow(int a, String b, String c) {
>>         this.col1  = a ;
>>         this.col2  = b ;
>>         this.col3  = c ;
>>         }
>> }
>>
>> public static void main(String[] args) throws IOException,
>>
>>     InterruptedException, ClassNotFoundException {
>>
>>             String path = "/tmp/orcfile1";
>>
>>             try {
>>
>>             conf = new Configuration();
>>         FileSystem fs = FileSystem.getLocal(conf);
>>
>>             ObjectInspector ObjInspector =
>> ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class,
>> ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
>>             writer = OrcFile.createWriter(new Path(path),
>> OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));
>>
>>         writer.addRow(new OrcRow(1,"hello","orcFile")) ;
>>         writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;
>>
>>             writer.close();
>>             }
>>             catch (Exception e)
>>             {
>>                     e.printStackTrace();
>>             }
>>     }
>> }
>> =================================================
>>
>> Thanks,
>>  Ravi
>>
>>
>>
>> From:        István <le...@gmail.com>
>> To:        user@orc.apache.org
>> Date:        01/08/2016 01:53 AM
>> Subject:        Writing ORC files without HDFS
>> ------------------------------
>>
>>
>>
>> Hi all,
>>
>> I am working on a project that requires me to write ORC files locally on
>> a non-HDFS location. I was wondering if there is any project doing
>> something similar, but I guess there is none, after spending some time on
>> Google.
>>
>> I think what needs to get done is to re-implement the ORC Writer, sort of
>> similar to the following but leaving out Hadoop:
>>
>>
>> *https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java*
>> <https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java>
>>
>> Am I on the right track implementing this?
>>
>> Let me know if you have any suggestions or links in the subject.
>>
>> Thank you very much,
>> Istvan
>>
>> --
>> the sun shines for all
>>
>>
>>
>>
>
>
> --
> the sun shines for all
>
>
>

Re: Writing ORC files without HDFS

Posted by István <le...@gmail.com>.
Hi Ravi,

Excellent response, thank you very much, this is exactly what I was looking for!

Best regards,
Istvan

On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <ra...@in.ibm.com>
wrote:

> Hello,
>
> You can write ORC-files on local-filesystem, by getting the local
> "FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the
> simple-example given below & see if it works for your requirement.
>
> =================================================
> public class orcw {
>
>         private static Configuration conf = new Configuration();
>         public static Writer writer ;
>
> public static class OrcRow
> {
>         int col1 ;
>         String col2 ;
>         String col3 ;
>
>         OrcRow(int a, String b, String c) {
>         this.col1  = a ;
>         this.col2  = b ;
>         this.col3  = c ;
>         }
> }
>
> public static void main(String[] args) throws IOException,
>
>     InterruptedException, ClassNotFoundException {
>
>             String path = "/tmp/orcfile1";
>
>             try {
>
>             conf = new Configuration();
>         FileSystem fs = FileSystem.getLocal(conf);
>
>             ObjectInspector ObjInspector =
> ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class,
> ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
>             writer = OrcFile.createWriter(new Path(path),
> OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));
>
>         writer.addRow(new OrcRow(1,"hello","orcFile")) ;
>         writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;
>
>             writer.close();
>             }
>             catch (Exception e)
>             {
>                     e.printStackTrace();
>             }
>     }
> }
> =================================================
>
> Thanks,
>  Ravi
>
>
>
> From:        István <le...@gmail.com>
> To:        user@orc.apache.org
> Date:        01/08/2016 01:53 AM
> Subject:        Writing ORC files without HDFS
> ------------------------------
>
>
>
> Hi all,
>
> I am working on a project that requires me to write ORC files locally on a
> non-HDFS location. I was wondering if there is any project doing something
> similar, but I guess there is none, after spending some time on Google.
>
> I think what needs to get done is to re-implement the ORC Writer, sort of
> similar to the following but leaving out Hadoop:
>
>
> *https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java*
> <https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java>
>
> Am I on the right track implementing this?
>
> Let me know if you have any suggestions or links in the subject.
>
> Thank you very much,
> Istvan
>
> --
> the sun shines for all
>
>
>
>


-- 
the sun shines for all

Re: Writing ORC files without HDFS

Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Yes. "hive" project is the right candidate, for this work.

Thanks,
 Ravi




From:   István <le...@gmail.com>
To:     user@orc.apache.org
Date:   01/13/2016 07:28 PM
Subject:        Re: Writing ORC files without HDFS



Ravi,

one final question: which package should I import into my project?

I found this as the best candidate:

<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>1.2.0</version>
</dependency>

Thanks,
Istvan

On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <ra...@in.ibm.com> 
wrote:
Hello,

You can write ORC-files on local-filesystem, by getting the local 
"FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the 
simple-example given below & see if it works for your requirement.

=================================================
public class orcw {

        private static Configuration conf = new Configuration();
        public static Writer writer ;

public static class OrcRow
{
        int col1 ;
        String col2 ;
        String col3 ;

        OrcRow(int a, String b, String c) {
        this.col1  = a ; 
        this.col2  = b ; 
        this.col3  = c ; 
        }
}

public static void main(String[] args) throws IOException,

    InterruptedException, ClassNotFoundException {

            String path = "/tmp/orcfile1";

            try {

            conf = new Configuration();
        FileSystem fs = FileSystem.getLocal(conf);

            ObjectInspector ObjInspector = 
ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class, 
ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
            writer = OrcFile.createWriter(new Path(path), 
OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));


        writer.addRow(new OrcRow(1,"hello","orcFile")) ;
        writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;

            writer.close();
            } 
            catch (Exception e)
            {
                    e.printStackTrace();
            }
    }
}
=================================================

Thanks,
 Ravi



From:        István <le...@gmail.com>
To:        user@orc.apache.org
Date:        01/08/2016 01:53 AM
Subject:        Writing ORC files without HDFS




Hi all,

I am working on a project that requires me to write ORC files locally on a 
non-HDFS location. I was wondering if there is any project doing something 
similar, but I guess there is none, after spending some time on Google.

I think what needs to get done is to re-implement the ORC Writer, sort of 
similar to the following but leaving out Hadoop:

https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java


Am I on the right track implementing this? 

Let me know if you have any suggestions or links in the subject.

Thank you very much,
Istvan

-- 
the sun shines for all






-- 
the sun shines for all





Re: Writing ORC files without HDFS

Posted by István <le...@gmail.com>.
Ravi,

one final question: which package should I import into my project?

I found this as the best candidate:

<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>1.2.0</version>
</dependency>
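
If the Hadoop classes used in the example (Configuration, FileSystem, Path) do not resolve through hive-exec's transitive dependencies, hadoop-common may be needed as well; a hypothetical addition, with the version chosen only for illustration:

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0</version>
</dependency>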

Thanks,
Istvan

On Fri, Jan 8, 2016 at 8:38 AM, Ravi Tatapudi <ra...@in.ibm.com>
wrote:

> Hello,
>
> You can write ORC-files on local-filesystem, by getting the local
> "FileSystem" object, as "FileSystem.getLocal(conf)". Pl. find below the
> simple-example given below & see if it works for your requirement.
>
> =================================================
> public class orcw {
>
>         private static Configuration conf = new Configuration();
>         public static Writer writer ;
>
> public static class OrcRow
> {
>         int col1 ;
>         String col2 ;
>         String col3 ;
>
>         OrcRow(int a, String b, String c) {
>         this.col1  = a ;
>         this.col2  = b ;
>         this.col3  = c ;
>         }
> }
>
> public static void main(String[] args) throws IOException,
>
>     InterruptedException, ClassNotFoundException {
>
>             String path = "/tmp/orcfile1";
>
>             try {
>
>             conf = new Configuration();
>         FileSystem fs = FileSystem.getLocal(conf);
>
>             ObjectInspector ObjInspector =
> ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class,
> ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
>             writer = OrcFile.createWriter(new Path(path),
> OrcFile.writerOptions(conf).inspector(ObjInspector).stripeSize(100000).bufferSize(10000).compress(CompressionKind.ZLIB).version(OrcFile.Version.V_0_12));
>
>         writer.addRow(new OrcRow(1,"hello","orcFile")) ;
>         writer.addRow(new OrcRow(2,"hello2","orcFile2")) ;
>
>             writer.close();
>             }
>             catch (Exception e)
>             {
>                     e.printStackTrace();
>             }
>     }
> }
> =================================================
>
> Thanks,
>  Ravi
>
>
>
> From:        István <le...@gmail.com>
> To:        user@orc.apache.org
> Date:        01/08/2016 01:53 AM
> Subject:        Writing ORC files without HDFS
> ------------------------------
>
>
>
> Hi all,
>
> I am working on a project that requires me to write ORC files locally on a
> non-HDFS location. I was wondering if there is any project doing something
> similar, but I guess there is none, after spending some time on Google.
>
> I think what needs to get done is to re-implement the ORC Writer, sort of
> similar to the following but leaving out Hadoop:
>
>
> *https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java*
> <https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java>
>
> Am I on the right track implementing this?
>
> Let me know if you have any suggestions or links in the subject.
>
> Thank you very much,
> Istvan
>
> --
> the sun shines for all
>
>
>
>


-- 
the sun shines for all

Re: Writing ORC files without HDFS

Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Hello,

You can write ORC files on the local filesystem by getting the local
"FileSystem" object with "FileSystem.getLocal(conf)". Please see the simple
example below and check whether it works for your requirement.

=================================================
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

public class orcw {

        private static Configuration conf = new Configuration();
        public static Writer writer;

        // Simple row class; the ObjectInspector below is derived from it by reflection.
        public static class OrcRow {
                int col1;
                String col2;
                String col3;

                OrcRow(int a, String b, String c) {
                        this.col1 = a;
                        this.col2 = b;
                        this.col3 = c;
                }
        }

        public static void main(String[] args) throws IOException,
                        InterruptedException, ClassNotFoundException {

                String path = "/tmp/orcfile1";

                try {
                        conf = new Configuration();
                        // Local filesystem handle (the writer resolves the filesystem from the path and conf).
                        FileSystem fs = FileSystem.getLocal(conf);

                        // Build an ObjectInspector for OrcRow via reflection.
                        ObjectInspector objInspector = ObjectInspectorFactory.getReflectionObjectInspector(
                                        OrcRow.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA);

                        writer = OrcFile.createWriter(new Path(path),
                                        OrcFile.writerOptions(conf)
                                                        .inspector(objInspector)
                                                        .stripeSize(100000)
                                                        .bufferSize(10000)
                                                        .compress(CompressionKind.ZLIB)
                                                        .version(OrcFile.Version.V_0_12));

                        writer.addRow(new OrcRow(1, "hello", "orcFile"));
                        writer.addRow(new OrcRow(2, "hello2", "orcFile2"));

                        writer.close();
                } catch (Exception e) {
                        e.printStackTrace();
                }
        }
}
=================================================

Thanks,
 Ravi



From:   István <le...@gmail.com>
To:     user@orc.apache.org
Date:   01/08/2016 01:53 AM
Subject:        Writing ORC files without HDFS



Hi all,

I am working on a project that requires me to write ORC files locally on a 
non-HDFS location. I was wondering if there is any project doing something 
similar, but I guess there is none, after spending some time on Google.

I think what needs to get done is to re-implement the ORC Writer, sort of 
similar to the following but leaving out Hadoop:

https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java

Am I on the right track implementing this? 

Let me know if you have any suggestions or links in the subject.

Thank you very much,
Istvan

-- 
the sun shines for all