Posted to user@orc.apache.org by István <le...@gmail.com> on 2016/01/07 21:23:20 UTC
Writing ORC files without HDFS
Hi all,
I am working on a project that requires me to write ORC files locally, to a
non-HDFS location. After spending some time on Google, I could not find any
existing project doing something similar.
I think what needs to be done is to re-implement the ORC Writer, roughly
along the lines of the following but leaving out Hadoop:
https://github.com/apache/hive/blob/master/orc/src/java/org/apache/orc/impl/WriterImpl.java
Am I on the right track with this implementation?
Let me know if you have any suggestions or links on the subject.
Thank you very much,
Istvan
--
the sun shines for all
Re: Writing ORC files without HDFS
Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Oh ok. Thanks for the info.
Regards,
Ravi
Re: Writing ORC files without HDFS
Posted by István <le...@gmail.com>.
Hi Ravi,
I got the code working here:
https://github.com/StreamBright/orcdemo/blob/master/src/main/java/org/streambright/orcdemo/App.java
It seems OrcFile.createWriter accepts a path on the local filesystem
directly, so there is no need for FileSystem.getLocal.
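A minimal sketch of that observation, assuming the same Hive 1.2 writer API used in the example earlier in this thread (the class name, path, and row fields here are illustrative, not from the original message):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

public class LocalOrcWriter {
    // Simple row type; the ORC schema is derived from it by reflection.
    public static class Row {
        int id;
        String name;
        Row(int id, String name) { this.id = id; this.name = name; }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        ObjectInspector inspector =
            ObjectInspectorFactory.getReflectionObjectInspector(Row.class,
                ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
        // A plain local path is enough; no FileSystem.getLocal call needed.
        Writer writer = OrcFile.createWriter(new Path("/tmp/local-orc-demo.orc"),
            OrcFile.writerOptions(conf).inspector(inspector));
        writer.addRow(new Row(1, "first"));
        writer.addRow(new Row(2, "second"));
        writer.close();
    }
}
```

Running this requires the hive-exec jar (and its Hadoop dependencies) on the classpath; without HDFS configured, the Path resolves against the local filesystem.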
Regards,
Istvan
--
the sun shines for all
Re: Writing ORC files without HDFS
Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Yes, I think including this example code in the ORC documentation would be
useful (for test purposes, etc.).
Regards,
Ravi
Re: Writing ORC files without HDFS
Posted by István <le...@gmail.com>.
I have created a small project with this code; if I can add it to the
documentation, I will.
Regards,
Istvan
--
the sun shines for all
Re: Writing ORC files without HDFS
Posted by Lefty Leverenz <le...@gmail.com>.
Should this be included in the ORC documentation?
-- Lefty
Re: Writing ORC files without HDFS
Posted by István <le...@gmail.com>.
Hi Ravi,
Excellent response, thank you very much; this is exactly what I was looking for!
Best regards,
Istvan
--
the sun shines for all
Re: Writing ORC files without HDFS
Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Yes, the "hive" project is the right candidate for this work.
Thanks,
Ravi
Re: Writing ORC files without HDFS
Posted by István <le...@gmail.com>.
Ravi,
one final question: which package should I import into my project?
I found this as the best candidate:
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>1.2.0</version>
</dependency>
Thanks,
Istvan
--
the sun shines for all
Re: Writing ORC files without HDFS
Posted by Ravi Tatapudi <ra...@in.ibm.com>.
Hello,
You can write ORC files on the local filesystem by getting the local
"FileSystem" object with "FileSystem.getLocal(conf)". Please find the simple
example below and see if it works for your requirement.
=================================================
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

public class orcw {

    private static Configuration conf = new Configuration();
    public static Writer writer;

    // Row type; the ORC schema is derived from it by reflection below.
    public static class OrcRow {
        int col1;
        String col2;
        String col3;

        OrcRow(int a, String b, String c) {
            this.col1 = a;
            this.col2 = b;
            this.col3 = c;
        }
    }

    public static void main(String[] args) throws IOException,
            InterruptedException, ClassNotFoundException {
        String path = "/tmp/orcfile1";
        try {
            conf = new Configuration();
            // Get the local filesystem instead of HDFS.
            FileSystem fs = FileSystem.getLocal(conf);
            // Build an inspector describing OrcRow's fields.
            ObjectInspector objInspector =
                ObjectInspectorFactory.getReflectionObjectInspector(OrcRow.class,
                    ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
            writer = OrcFile.createWriter(new Path(path),
                OrcFile.writerOptions(conf)
                    .inspector(objInspector)
                    .stripeSize(100000)
                    .bufferSize(10000)
                    .compress(CompressionKind.ZLIB)
                    .version(OrcFile.Version.V_0_12));
            writer.addRow(new OrcRow(1, "hello", "orcFile"));
            writer.addRow(new OrcRow(2, "hello2", "orcFile2"));
            writer.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
=================================================
Thanks,
Ravi