Posted to mapreduce-user@hadoop.apache.org by Alex Lee <el...@hotmail.com> on 2014/05/07 04:48:11 UTC

Realtime sensor's tcpip data to hadoop

Sensors may send TCP/IP data to a server. Each sensor sends its data like a stream to the server; the quantity of sensors is large and the data rate is high.
 
Firstly, how can the data from TCP/IP be put into Hadoop? It needs some processing and then storage in HBase. Does the data have to be saved to files first and then loaded into Hadoop, or can it be ingested directly from TCP/IP? Is there any software module that can take care of this? Searching suggested that Ganglia, Nagios, and Flume may do it, but on closer look Ganglia and Nagios are more for monitoring the Hadoop cluster itself, and Flume is for log files.
 
Secondly, if the total network traffic from the sensors exceeds the capacity of one LAN port, how can the load be shared? Is there any component in Hadoop that handles this automatically?
 
Any suggestions, thanks.

Re: Realtime sensor's tcpip data to hadoop

Posted by Peyman Mohajerian <mo...@gmail.com>.
Whether you use Storm/Kafka or any other real-time processing system or not,
you may still need to persist the data, which can be done directly to HBase
from any of these real-time systems or from the source.
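
As a rough sketch of the "persist directly to HBase" option, using the plain
HBase Java client (the table name, column family, and row-key layout below
are invented for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SensorHBaseWriter {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath to locate the cluster.
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "sensor_data");
        // sensorId + timestamp keeps one sensor's readings together, ordered by time.
        Put put = new Put(Bytes.toBytes("sensor42-" + System.currentTimeMillis()));
        // column family "d", qualifier "value"
        put.add(Bytes.toBytes("d"), Bytes.toBytes("value"), Bytes.toBytes("23.5"));
        table.put(put);
        table.close();
    }
}

The same Put logic works unchanged inside a Storm bolt or a Spark foreachRDD,
which is why the persistence question is independent of the streaming choice.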


On Thu, May 8, 2014 at 9:25 PM, Hardik Pandya <sm...@gmail.com> wrote:

> If I were you, I would ask the following questions to get the answer:
>
> > forget about Hadoop for a minute and ask yourself how the TCP/IP data
> are currently being stored - in a filesystem/RDBMS?
> > Hadoop is for offline batch processing - if you are looking for a
> real-time streaming solution, there is Storm (from Twitter) that goes well
> with Kafka (a messaging queue, from LinkedIn) or Spark Streaming (in-memory
> map-reduce), which takes real-time streams - it has a built-in Twitter API,
> but you need to write your own service to poll data every few seconds and
> send it as RDDs
> > Storm is complementary to Hadoop - Spark in conjunction with Hadoop will
> let you do both offline and real-time data analytics
>
>
>
>
> On Tue, May 6, 2014 at 10:48 PM, Alex Lee <el...@hotmail.com> wrote:
>
>> Sensors may send TCP/IP data to a server. Each sensor sends its data like
>> a stream to the server; the quantity of sensors is large and the data rate
>> is high.
>>
>> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
>> processing and then storage in HBase. Does the data have to be saved to
>> files first and then loaded into Hadoop, or can it be ingested directly
>> from TCP/IP? Is there any software module that can take care of this?
>> Searching suggested that Ganglia, Nagios, and Flume may do it, but on
>> closer look Ganglia and Nagios are more for monitoring the Hadoop cluster
>> itself, and Flume is for log files.
>>
>> Secondly, if the total network traffic from the sensors exceeds the
>> capacity of one LAN port, how can the load be shared? Is there any
>> component in Hadoop that handles this automatically?
>>
>> Any suggestions, thanks.
>>
>
>

Re: Realtime sensor's tcpip data to hadoop

Posted by Hardik Pandya <sm...@gmail.com>.
If I were you, I would ask the following questions to get the answer:

> forget about Hadoop for a minute and ask yourself how the TCP/IP data are
currently being stored - in a filesystem/RDBMS?
> Hadoop is for offline batch processing - if you are looking for a real-time
streaming solution, there is Storm (from Twitter) that goes well with Kafka
(a messaging queue, from LinkedIn) or Spark Streaming (in-memory map-reduce),
which takes real-time streams - it has a built-in Twitter API, but you need
to write your own service to poll data every few seconds and send it as RDDs
> Storm is complementary to Hadoop - Spark in conjunction with Hadoop will
let you do both offline and real-time data analytics
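
A minimal Spark Streaming sketch of the socket-ingestion idea (the host
"sensor-gw" and port 9999 are placeholders, and the HBase write is left as a
comment):

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class SensorStream {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("SensorStream").setMaster("local[2]");
        // 1-second micro-batches; each batch is handed to you as an RDD.
        JavaStreamingContext ssc = new JavaStreamingContext(conf, new Duration(1000));
        // Newline-delimited text arriving on a TCP socket becomes a DStream.
        JavaDStream<String> lines = ssc.socketTextStream("sensor-gw", 9999);
        lines.print(); // replace with parsing plus a foreachRDD that writes to HBase
        ssc.start();
        ssc.awaitTermination();
    }
}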




On Tue, May 6, 2014 at 10:48 PM, Alex Lee <el...@hotmail.com> wrote:

> Sensors may send TCP/IP data to a server. Each sensor sends its data like
> a stream to the server; the quantity of sensors is large and the data rate
> is high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does the data have to be saved to
> files first and then loaded into Hadoop, or can it be ingested directly
> from TCP/IP? Is there any software module that can take care of this?
> Searching suggested that Ganglia, Nagios, and Flume may do it, but on
> closer look Ganglia and Nagios are more for monitoring the Hadoop cluster
> itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> capacity of one LAN port, how can the load be shared? Is there any
> component in Hadoop that handles this automatically?
>
> Any suggestions, thanks.
>

fuse_dfs coredump

Posted by "tdhkx@126.com" <td...@126.com>.
Hi All,

I've used fuse_dfs to mount HDFS locally, and it succeeded; I can use Linux commands on the mount.
But when I started Pure-FTPd on it, it core dumped.

Run info:
[root@datanode2 fuse-dfs]# ./fuse_dfs_wrapper.sh rw -oserver=192.168.1.2 -oport=53310 /home/3183/ -d 
INFO /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_options.c:164 Adding FUSE arg /home/3183/ 
INFO /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_options.c:115 Ignoring option -d 
FUSE library version: 2.9.3 
nullpath_ok: 0 
nopath: 0 
utime_omit_ok: 0 
unique: 1, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0 
INIT: 7.13 
flags=0x0000007b 
max_readahead=0x00020000 
INFO /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_init.c:98 Mounting with options: [ protected=(NULL), nn_uri=192.168.1.2, nn_port=53310, debug=0, read_only=0, initchecks=0, no_permissions=0, usetrash=0, entry_timeout=60, attribute_timeout=60, rdbuffer_size=10485760, direct_io=0 ] 
fuseConnectInit: initialized with timer period 5, expiry period 300 
INIT: 7.19 
flags=0x00000039 
max_readahead=0x00020000 
max_write=0x00020000 
max_background=0 
congestion_threshold=0 
unique: 1, success, outsize: 40 
unique: 2, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 1305 
getattr / 
unique: 2, success, outsize: 120 
unique: 3, opcode: LOOKUP (1), nodeid: 1, insize: 48, pid: 1305 
LOOKUP /.banner 
getattr /.banner 
# 
# A fatal error has been detected by the Java Runtime Environment: 
# 
# SIGSEGV (0xb) at pc=0x00007fc5811c6116, pid=1221, tid=140486250882816 
# 
# JRE version: 6.0_24-b07 
# Java VM: Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode linux-amd64 compressed oops) 
# Problematic frame: 
# C [libc.so.6+0x7f116] 
# 
# An error report file with more information is saved as: 
# /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/target/native/main/native/fuse-dfs/hs_err_pid1221.log 
# 
# If you would like to submit a bug report, please visit: 
# http://java.sun.com/webapps/bugreport/crash.jsp 
# 
./fuse_dfs_wrapper.sh: line 46: 1221 (core dumped) ./fuse_dfs $@

Backtrace info:
[root@datanode2 fuse-dfs]# gdb fuse_dfs core 
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6) 
Copyright (C) 2010 Free Software Foundation, Inc. 
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> 
This is free software: you are free to change and redistribute it. 
There is NO WARRANTY, to the extent permitted by law. Type "show copying" 
and "show warranty" for details. 
This GDB was configured as "x86_64-redhat-linux-gnu". 
For bug reporting instructions, please see: 
<http://www.gnu.org/software/gdb/bugs/>... 
Reading symbols from /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/target/native/main/native/fuse-dfs/fuse_dfs...done. 
[New Thread 1224] 
[New Thread 1226] 
[New Thread 1228] 
[New Thread 1227] 
[New Thread 1241] 
[New Thread 1229] 
[New Thread 1242] 
[New Thread 1221] 
[New Thread 1230] 
[New Thread 1225] 
[New Thread 1243] 
[New Thread 1235] 
[New Thread 1239] 
[New Thread 1311] 
[New Thread 1236] 
[New Thread 1303] 
[New Thread 1310] 
[New Thread 1240] 
Missing separate debuginfo for /usr/local/lib/libfuse.so.2 
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/2a/e3dd5ab8947a34e176447deea9b49ee038ee3e 
Missing separate debuginfo for /home/hkx/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native/libhdfs.so.0.0.0 
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/fc/7378e225b143f127e67f4d933bb3d3022c0279 
Missing separate debuginfo for /home/hkx/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/b8/b57bfdef378f69cf66754e0fc91f18109e5378 
Missing separate debuginfo for 
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/71/e48146fa2dcac5d4d82a1ef860d535627aa1b9 
Reading symbols from /usr/local/lib/libfuse.so.2...done. 
Loaded symbols for /usr/local/lib/libfuse.so.2 
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done. 
Loaded symbols for /lib64/librt.so.1 
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done. 
Loaded symbols for /lib64/libdl.so.2 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
Reading symbols from /home/hkx/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native/libhdfs.so.0.0.0...done. 
Loaded symbols for /home/hkx/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native/libhdfs.so.0.0.0 
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done. 
Loaded symbols for /lib64/libm.so.6 
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done. 
[Thread debugging using libthread_db enabled] 
Loaded symbols for /lib64/libpthread.so.0 
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done. 
Loaded symbols for /lib64/libc.so.6 
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done. 
Loaded symbols for /lib64/ld-linux-x86-64.so.2 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libverify.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libverify.so 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libjava.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libjava.so 
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done. 
Loaded symbols for /lib64/libnsl.so.1 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/native_threads/libhpi.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/native_threads/libhpi.so 
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done. 
Loaded symbols for /lib64/libnss_files.so.2 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libzip.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libzip.so 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libmanagement.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libmanagement.so 
Reading symbols from /home/hkx/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0...done. 
Loaded symbols for /home/hkx/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libjaas_unix.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libjaas_unix.so 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libnet.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libnet.so 
Reading symbols from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libnio.so...(no debugging symbols found)...done. 
Loaded symbols for /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/libnio.so 
Core was generated by `./fuse_dfs rw -oserver=192.168.1.2 -oport=53310 /home/3183/ -d'. 
Program terminated with signal 6, Aborted. 
#0 0x00007fc5811798a5 in raise () from /lib64/libc.so.6 
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.6.x86_64 
(gdb) bt 
#0 0x00007fc5811798a5 in raise () from /lib64/libc.so.6 
#1 0x00007fc58117b085 in abort () from /lib64/libc.so.6 
#2 0x00007fc5821c6fd7 in os::abort(bool) () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#3 0x00007fc58231a05d in VMError::report_and_die() () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#4 0x00007fc58231aba1 in crash_handler(int, siginfo*, void*) () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#5 <signal handler called> 
#6 0x00007fc5821c45a6 in os::is_first_C_frame(frame*) () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#7 0x00007fc582318fda in VMError::report(outputStream*) () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#8 0x00007fc582319f75 in VMError::report_and_die() () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#9 0x00007fc5821cd655 in JVM_handle_linux_signal () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#10 0x00007fc5821c9bae in signalHandler(int, siginfo*, void*) () from /home/yjx/JDK/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so 
#11 <signal handler called> 
#12 0x00007fc5811c6116 in __strcmp_sse2 () from /lib64/libc.so.6 
#13 0x0000000000403cdd in hdfsConnCompare (head=<value optimized out>, elm=<value optimized out>) 
at /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c:203 
#14 hdfsConnTree_RB_FIND (head=<value optimized out>, elm=<value optimized out>) 
at /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c:81 
#15 0x00000000004041e5 in hdfsConnFind (usrname=0x0, ctx=0x7fc57c000ac0, out=0x7fc581145b00) 
at /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c:219 
#16 fuseConnect (usrname=0x0, ctx=0x7fc57c000ac0, out=0x7fc581145b00) 
at /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c:516 
#17 0x00000000004042d7 in fuseConnectAsThreadUid (conn=0x7fc581145b00) 
at /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c:543 
#18 0x0000000000404c05 in dfs_getattr (path=0x7fc57cbd5810 "/.banner", st=0x7fc581145bd0) 
at /home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_getattr.c:37 
#19 0x00007fc582aac393 in lookup_path (f=0x9a46d0, nodeid=1, name=0x7fc582ec3038 ".banner", path=<value optimized out>, 
e=0x7fc581145bc0, fi=<value optimized out>) at fuse.c:2470 
---Type <return> to continue, or q <return> to quit--- 
#20 0x00007fc582aae856 in fuse_lib_lookup (req=0x7fc57cb4a050, parent=1, name=0x7fc582ec3038 ".banner") at fuse.c:2669 
#21 0x00007fc582ab669d in fuse_ll_process_buf (data=0x9a49c0, buf=0x7fc581145e60, ch=0x9a4cd8) at fuse_lowlevel.c:2441 
#22 0x00007fc582ab2fea in fuse_do_work (data=0x9a6830) at fuse_loop_mt.c:117 
#23 0x00007fc5814e1851 in start_thread () from /lib64/libpthread.so.0 
#24 0x00007fc58122f11d in clone () from /lib64/libc.so.6 
(gdb) quit

Any suggestions, thanks.


tdhkx@126.com

Re: Realtime sensor's tcpip data to hadoop

Posted by alex kamil <al...@gmail.com>.
Or you can use a combination of Kafka <http://kafka.apache.org/> +
Phoenix <http://phoenix.incubator.apache.org/>.
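
A compressed sketch of that pipeline (broker and ZooKeeper addresses are
placeholders, and the SENSOR_DATA table is assumed to have been created
beforehand with Phoenix DDL); in practice the producer half lives in the
TCP/IP listener and the upsert half runs inside a Kafka consumer loop:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class KafkaPhoenixSketch {
    public static void main(String[] args) throws Exception {
        // Producer half: push each raw reading onto a Kafka topic.
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("sensor-readings", "sensor42", "23.5"));
        producer.close();

        // Consumer half: Phoenix turns a JDBC upsert into an HBase write.
        Class.forName("org.apache.phoenix.jdbc.PhoenixDriver"); // explicit registration, for older JDBC setups
        Connection conn = DriverManager.getConnection("jdbc:phoenix:zk1");
        PreparedStatement ps = conn.prepareStatement(
            "UPSERT INTO SENSOR_DATA (SENSOR_ID, TS, VAL) VALUES (?, ?, ?)");
        ps.setString(1, "sensor42");
        ps.setLong(2, System.currentTimeMillis());
        ps.setDouble(3, 23.5);
        ps.executeUpdate();
        conn.commit();
        conn.close();
    }
}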


On Wed, May 7, 2014 at 8:55 PM, Azuryy Yu <az...@gmail.com> wrote:

> Hi Alex,
>
> you can try Apache Flume.
>
>
> On Wed, May 7, 2014 at 10:48 AM, Alex Lee <el...@hotmail.com> wrote:
>
>> Sensors may send TCP/IP data to a server. Each sensor sends its data like
>> a stream to the server; the quantity of sensors is large and the data rate
>> is high.
>>
>> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
>> processing and then storage in HBase. Does the data have to be saved to
>> files first and then loaded into Hadoop, or can it be ingested directly
>> from TCP/IP? Is there any software module that can take care of this?
>> Searching suggested that Ganglia, Nagios, and Flume may do it, but on
>> closer look Ganglia and Nagios are more for monitoring the Hadoop cluster
>> itself, and Flume is for log files.
>>
>> Secondly, if the total network traffic from the sensors exceeds the
>> capacity of one LAN port, how can the load be shared? Is there any
>> component in Hadoop that handles this automatically?
>>
>> Any suggestions, thanks.
>>
>
>

Re: Realtime sensor's tcpip data to hadoop

Posted by Azuryy Yu <az...@gmail.com>.
Hi Alex,

you can try Apache Flume.
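
A minimal Java sketch of that route, assuming a Flume agent with an Avro
source listening on "flume-host" port 41414 (both placeholders); the
process receiving the sensors' TCP connections would forward each
reading as a Flume event:

import java.nio.charset.Charset;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class SensorToFlume {
    public static void main(String[] args) throws EventDeliveryException {
        // Assumes an agent whose avro source listens on flume-host:41414,
        // with an HDFS or HBase sink behind its channel.
        RpcClient client = RpcClientFactory.getDefaultInstance("flume-host", 41414);
        try {
            Event event = EventBuilder.withBody(
                "sensor-42,1399450000,23.5",   // hypothetical reading
                Charset.forName("UTF-8"));
            client.append(event);              // hands the event to the agent
        } finally {
            client.close();
        }
    }
}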


On Wed, May 7, 2014 at 10:48 AM, Alex Lee <el...@hotmail.com> wrote:

> Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP
> data like a stream to the server; the quantity of sensors and the data
> rate are high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files and then put into Hadoop, or can it be done in some direct way
> from TCP/IP? Is there any software module that can take care of this?
> Searches suggest Ganglia, Nagios and Flume may do it, but on closer
> inspection, Ganglia and Nagios are more for monitoring the Hadoop
> cluster itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> limit of one LAN port, how can the load be shared? Is there any
> component in Hadoop to make this happen automatically?
>
> Any suggestions, thanks.
>

Re: Realtime sensor's tcpip data to hadoop

Posted by Hardik Pandya <sm...@gmail.com>.
If I were you, I would ask the following questions to get the answer:

> Forget about Hadoop for a minute and ask yourself how the TCP/IP data
are currently being stored - in a filesystem or RDBMS?
> Hadoop is for offline batch processing. If you are looking for a
real-time streaming solution, there is Storm (originally from Twitter),
which goes well with Kafka (a messaging queue from LinkedIn), or Spark
Streaming (micro-batch processing on Spark's in-memory engine), which
takes real-time streams - it has a built-in Twitter API, but you need to
write your own service to poll data every few seconds and send it as
RDDs (see the sketch after this list).
> Storm is complementary to Hadoop - Spark in conjunction with Hadoop
will allow you to do both offline and real-time data analytics.
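
A minimal Java sketch of the Spark Streaming route, assuming the sensor
readings arrive as text lines on a TCP socket at localhost:9999 (a
placeholder endpoint):

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class SensorStream {
    public static void main(String[] args) throws InterruptedException {
        // local[2]: one thread for the socket receiver, one for processing.
        SparkConf conf = new SparkConf()
            .setAppName("SensorStream").setMaster("local[2]");
        // Micro-batches of one second over the incoming stream.
        JavaStreamingContext jssc =
            new JavaStreamingContext(conf, Durations.seconds(1));
        JavaDStream<String> lines = jssc.socketTextStream("localhost", 9999);
        // Count readings per batch; a real job would parse each line and
        // write the results to HBase.
        lines.count().print();
        jssc.start();
        jssc.awaitTermination();
    }
}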




On Tue, May 6, 2014 at 10:48 PM, Alex Lee <el...@hotmail.com> wrote:

> Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP
> data like a stream to the server; the quantity of sensors and the data
> rate are high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files and then put into Hadoop, or can it be done in some direct way
> from TCP/IP? Is there any software module that can take care of this?
> Searches suggest Ganglia, Nagios and Flume may do it, but on closer
> inspection, Ganglia and Nagios are more for monitoring the Hadoop
> cluster itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> limit of one LAN port, how can the load be shared? Is there any
> component in Hadoop to make this happen automatically?
>
> Any suggestions, thanks.
>

Re: Realtime sensor's tcpip data to hadoop

Posted by Raj K Singh <ra...@gmail.com>.
I suggest you don't pipe the sensor data into HDFS directly. Instead,
you can have a program (in Java, Python, etc.) on the server itself that
processes the incoming sensor data and writes it to a text/binary file
(I don't know the data format you are currently receiving). You can then
put the data file on HDFS; alternatively, you can process the data
directly and save it to your HBase tables managed on HDFS, as in the
sketch below.

If your sensor data is log data, you can use Flume to load that data
into HDFS directly.
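
A minimal Java sketch of that middle-tier program, using the classic
HBase client API of this era and assuming readings arrive as
"sensorId,value" text lines on port 9000 into a pre-created table
'sensor' with column family 'd' (all of these are placeholders):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SensorTcpToHBase {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "sensor");     // created as: create 'sensor', 'd'
        ServerSocket server = new ServerSocket(9000);  // placeholder listen port
        while (true) {                                 // one connection at a time, for brevity
            Socket sock = server.accept();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(sock.getInputStream(), "UTF-8"));
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.split(",");      // "sensorId,value"
                // Row key = sensor id + timestamp, so successive readings
                // from the same sensor land in distinct rows.
                Put put = new Put(Bytes.toBytes(
                    parts[0] + "-" + System.currentTimeMillis()));
                put.add(Bytes.toBytes("d"), Bytes.toBytes("value"),
                    Bytes.toBytes(parts[1]));
                table.put(put);
            }
            sock.close();
        }
    }
}

A real version would accept connections on a thread pool and batch the
Puts rather than writing one row at a time.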

Thanks

::::::::::::::::::::::::::::::::::::::::
Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile  Tel: +91 (0)9899821370


On Wed, May 7, 2014 at 8:18 AM, Alex Lee <el...@hotmail.com> wrote:

> Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP
> data like a stream to the server; the quantity of sensors and the data
> rate are high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files and then put into Hadoop, or can it be done in some direct way
> from TCP/IP? Is there any software module that can take care of this?
> Searches suggest Ganglia, Nagios and Flume may do it, but on closer
> inspection, Ganglia and Nagios are more for monitoring the Hadoop
> cluster itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> limit of one LAN port, how can the load be shared? Is there any
> component in Hadoop to make this happen automatically?
>
> Any suggestions, thanks.
>

Re: Realtime sensor's tcpip data to hadoop

Posted by Harsh J <ha...@cloudera.com>.
Apache Flume isn't just 'for log files' - it is an event collection
framework and would fit your use-case.

For further questions over Flume, please ask on user@flume.apache.org.

On Wed, May 7, 2014 at 8:18 AM, Alex Lee <el...@hotmail.com> wrote:
> Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP
> data like a stream to the server; the quantity of sensors and the data
> rate are high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files and then put into Hadoop, or can it be done in some direct way
> from TCP/IP? Is there any software module that can take care of this?
> Searches suggest Ganglia, Nagios and Flume may do it, but on closer
> inspection, Ganglia and Nagios are more for monitoring the Hadoop
> cluster itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> limit of one LAN port, how can the load be shared? Is there any
> component in Hadoop to make this happen automatically?
>
> Any suggestions, thanks.



-- 
Harsh J

RE: Realtime sensor's tcpip data to hadoop

Posted by Alex Lee <el...@hotmail.com>.
There are so many choices: OpenTSDB, Flume, ActiveMQ, Lustre, Splunk,
Ganglia or Nagios. Not very sure which one fits best.
 
Date: Fri, 9 May 2014 15:46:24 +0200
Subject: Re: Realtime sensor's tcpip data to hadoop
From: dechouxb@gmail.com
To: user@hadoop.apache.org; user@flume.apache.org

Flume is indeed something you should look into. 'Log files' is a simplification. Flume really handles events, and yes, logs are a common kind of event, but not the only one.
Regards

Bertrand
On Wed, May 7, 2014 at 4:48 AM, Alex Lee <el...@hotmail.com> wrote:




Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP data like a stream to the server; the quantity of sensors and the data rate are high.

Firstly, how can the data from TCP/IP be put into Hadoop? It needs some processing and then storage in HBase. Does it need to be saved to data files and then put into Hadoop, or can it be done in some direct way from TCP/IP? Is there any software module that can take care of this? Searches suggest Ganglia, Nagios and Flume may do it, but on closer inspection, Ganglia and Nagios are more for monitoring the Hadoop cluster itself, and Flume is for log files.

Secondly, if the total network traffic from the sensors exceeds the limit of one LAN port, how can the load be shared? Is there any component in Hadoop to make this happen automatically?

Any suggestions, thanks.

Re: Realtime sensor's tcpip data to hadoop

Posted by Bertrand Dechoux <de...@gmail.com>.
Flume is indeed something you should look into. 'Log files' is a
simplification. Flume really handles events, and yes, logs are a common
kind of event, but not the only one.

Regards

Bertrand

On Wed, May 7, 2014 at 4:48 AM, Alex Lee <el...@hotmail.com> wrote:

> Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP
> data like a stream to the server; the quantity of sensors and the data
> rate are high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files and then put into Hadoop, or can it be done in some direct way
> from TCP/IP? Is there any software module that can take care of this?
> Searches suggest Ganglia, Nagios and Flume may do it, but on closer
> inspection, Ganglia and Nagios are more for monitoring the Hadoop
> cluster itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> limit of one LAN port, how can the load be shared? Is there any
> component in Hadoop to make this happen automatically?
>
> Any suggestions, thanks.
>

Re: Realtime sensor's tcpip data to hadoop

Posted by sudhakara st <su...@gmail.com>.
Use Flume.


On Wed, May 7, 2014 at 8:18 AM, Alex Lee <el...@hotmail.com> wrote:

> Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP
> data like a stream to the server; the quantity of sensors and the data
> rate are high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files and then put into Hadoop, or can it be done in some direct way
> from TCP/IP? Is there any software module that can take care of this?
> Searches suggest Ganglia, Nagios and Flume may do it, but on closer
> inspection, Ganglia and Nagios are more for monitoring the Hadoop
> cluster itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> limit of one LAN port, how can the load be shared? Is there any
> component in Hadoop to make this happen automatically?
>
> Any suggestions, thanks.
>



-- 

Regards,
...sudhakara

Re: Realtime sensor's tcpip data to hadoop

Posted by Peyman Mohajerian <mo...@gmail.com>.
Flume is not just for log files; you can wire up a Flume source for this
purpose. There are also alternative open-source solutions for data
streaming, e.g. Apache Storm or Kafka.
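
For the Kafka route, a minimal Java producer sketch (the broker address
and topic name are placeholders); a consumer on the other side would
write the readings into HBase:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SensorKafkaProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-host:9092");  // placeholder broker
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer =
            new KafkaProducer<String, String>(props);
        // Keying by sensor id keeps one sensor's readings ordered within a
        // partition, while spreading sensors across partitions spreads the
        // load across brokers (which also helps with the single-port limit
        // from the second question).
        producer.send(new ProducerRecord<String, String>(
            "sensor-readings", "sensor-42", "1399450000,23.5"));
        producer.close();
    }
}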


On Tue, May 6, 2014 at 10:48 PM, Alex Lee <el...@hotmail.com> wrote:

> Sensors may send TCP/IP data to a server. Each sensor may send TCP/IP
> data like a stream to the server; the quantity of sensors and the data
> rate are high.
>
> Firstly, how can the data from TCP/IP be put into Hadoop? It needs some
> processing and then storage in HBase. Does it need to be saved to data
> files and then put into Hadoop, or can it be done in some direct way
> from TCP/IP? Is there any software module that can take care of this?
> Searches suggest Ganglia, Nagios and Flume may do it, but on closer
> inspection, Ganglia and Nagios are more for monitoring the Hadoop
> cluster itself, and Flume is for log files.
>
> Secondly, if the total network traffic from the sensors exceeds the
> limit of one LAN port, how can the load be shared? Is there any
> component in Hadoop to make this happen automatically?
>
> Any suggestions, thanks.
>
