Posted to dev@flume.apache.org by "Nandakumar (JIRA)" <ji...@apache.org> on 2016/10/02 19:38:20 UTC

[jira] [Created] (FLUME-3001) Flume Twitter data Streaming issue

Nandakumar created FLUME-3001:
---------------------------------

             Summary: Flume Twitter data Streaming issue
                 Key: FLUME-3001
                 URL: https://issues.apache.org/jira/browse/FLUME-3001
             Project: Flume
          Issue Type: Bug
          Components: Build, Configuration, Test
    Affects Versions: v1.5.2
         Environment: Personal - Home Network
            Reporter: Nandakumar
            Priority: Blocker
             Fix For: v1.5.2


Hi Team,
I am getting the error message below while streaming data from Twitter.

Could you please help me fix this issue?

    Error Message:

Twitter Stream consumer-1[Establishing connection]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] 404:The URI requested is invalid or the resource requested, such as a user, does not exist.
Unknown URL. See Twitter Streaming API documentation at http://dev.twitter.com/pages/streaming_api

    Flume conf file details:

#Naming the components on the current agent.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

#Describing/Configuring the source
TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = ******
TwitterAgent.sources.Twitter.consumerSecret = *******
TwitterAgent.sources.Twitter.accessToken = *******
TwitterAgent.sources.Twitter.accessTokenSecret = **********
TwitterAgent.sources.Twitter.keywords = docker,intel

#Describing/Configuring the sink
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://quickstart.cloudera:8020/user/Flume/twitter_data/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000

#Describing/Configuring the channel
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
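
As a sanity check on the wiring above, here is a minimal sketch that verifies every channel referenced by a source or sink is also declared under "<agent>.channels". The function name check_channel_wiring is hypothetical (it is not a Flume command), and the sketch assumes one property per line with no line continuations.

```shell
# check_channel_wiring CONF AGENT  (hypothetical helper, not part of Flume)
# Prints each channel that a source or sink references but that is not
# listed in "<AGENT>.channels"; returns non-zero if any such channel exists.
check_channel_wiring() {
    conf="$1"
    agent="$2"
    # Channels declared for the agent, e.g. "TwitterAgent.channels = MemChannel"
    declared=$(sed -n "s/^$agent\.channels *= *//p" "$conf")
    # Channels referenced via "<source>.channels" or "<sink>.channel"
    used=$(sed -n -E "s/^$agent\.(sources|sinks)\.[^.]+\.channels? *= *//p" "$conf")
    status=0
    for ch in $used; do
        case " $declared " in
            *" $ch "*) ;;
            *) echo "undeclared channel: $ch"; status=1 ;;
        esac
    done
    return $status
}

# Usage (path taken from step 6 below):
# check_channel_wiring /etc/flume-ng/conf/flume.conf TwitterAgent
```

For the configuration shown here, both the source and the sink reference MemChannel, which is declared, so the check would print nothing.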

    Twitter4J jar files I used (version 3.0.3):

twitter4j-core-3.0.3
twitter4j-media-support-3.0.3
twitter4j-stream-3.0.3

    Below are the steps I followed to stream the Twitter data:


STEP 1:
Log in as the root user
sudo -u root -i

STEP 2:
Install Flume
yum install flume-ng
yum install flume-ng-agent

STEP 3:
Download the jar
files.cloudera.com/samples/flume-sources-1.0-SNAPSHOT.jar

STEP 4:
Create the environment file
cd /etc/flume-ng/conf/
ls -l
cp /etc/flume-ng/conf/flume-env.sh.template /etc/flume-ng/conf/flume-env.sh

STEP 5:
Open the flume-env.sh file and add the downloaded jar to the classpath at the end of the file, using the commands below
nano /etc/flume-ng/conf/flume-env.sh
cat /etc/flume-ng/conf/flume-env.sh
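
The exact line added in this step is not shown, so the following is only a sketch of what it might look like. FLUME_CLASSPATH is the variable that appears (commented out) in flume-env.sh.template; the jar location is an assumption and should be replaced with wherever the file from step 3 was actually saved.

```shell
# Appended at the end of /etc/flume-ng/conf/flume-env.sh.
# The directory below is an assumed location for the downloaded jar.
FLUME_CLASSPATH="/usr/lib/flume-ng/lib/flume-sources-1.0-SNAPSHOT.jar"
```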

STEP 6:
Configure the Twitter agent conf file using the commands below
nano /etc/flume-ng/conf/flume.conf
cat /etc/flume-ng/conf/flume.conf

STEP 7:
Rename the Twitter4J jar files from .jar to .org
cd /usr/lib/flume-ng/lib
ls -l

mv /usr/lib/flume-ng/lib/twitter4j-core-3.0.3.jar /usr/lib/flume-ng/lib/twitter4j-core-3.0.3.org
mv /usr/lib/flume-ng/lib/twitter4j-media-support-3.0.3.jar /usr/lib/flume-ng/lib/twitter4j-media-support-3.0.3.org
mv /usr/lib/flume-ng/lib/twitter4j-stream-3.0.3.jar /usr/lib/flume-ng/lib/twitter4j-stream-3.0.3.org
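
The three renames above can also be sketched as a loop, which avoids missing a jar if the version number differs. The function name disable_twitter4j_jars is hypothetical and is not part of Flume; the sketch only assumes that the files match twitter4j-*.jar.

```shell
# disable_twitter4j_jars DIR  (hypothetical helper)
# Renames every twitter4j-*.jar in DIR to the same name with a .org
# extension, so the files no longer match *.jar.
disable_twitter4j_jars() {
    lib_dir="$1"
    for jar in "$lib_dir"/twitter4j-*.jar; do
        [ -e "$jar" ] || continue   # the glob matched nothing
        mv -- "$jar" "${jar%.jar}.org"
    done
}

# Usage (directory taken from this step):
# disable_twitter4j_jars /usr/lib/flume-ng/lib
```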

STEP 8:
Then run the agent
/usr/bin/flume-ng agent --conf /etc/flume-ng/conf/ -f /etc/flume-ng/conf/flume.conf -n TwitterAgent

STEP 9:
Check the logs
cat /var/log/flume-ng/flume.log
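
Because the full log can be long, a small filter can help to locate the failing request. This is a sketch only: show_stream_errors is a hypothetical helper name, and the patterns are simply the "404:" text from the error message above plus ERROR-level entries.

```shell
# show_stream_errors LOGFILE  (hypothetical helper)
# Prints, with line numbers, the log lines that mention a 404 response
# or contain an ERROR entry.
show_stream_errors() {
    grep -n -E '404:|ERROR' "$1"
}

# Usage (log path taken from this step):
# show_stream_errors /var/log/flume-ng/flume.log
```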

Thanks,
Nanda




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)