Hadoop using Flume to pull Twitter tweets
NickName:pinkco12 Ask DateTime:2018-04-18T03:34:38

I am getting a sink error when I try to fetch tweets from Twitter with Flume. I added the Twitter API credentials to the configuration and created the target directory in HDFS, but I'm not sure what I am doing wrong. I am using Hadoop 2.0.0-cdh4.2.1.

ERROR flume.SinkRunner: Unable to deliver event. 
ERROR hdfs.HDFSEventSink: process failed

Exception follows.

java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.

flume.conf

# Naming the components on the current agent. 
TwitterAgent.sources = twitter
TwitterAgent.channels = memoryChannel
TwitterAgent.sinks = HDFS
  
# Describing/Configuring the source 
TwitterAgent.sources.twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.twitter.consumerKey = Xxx
TwitterAgent.sources.twitter.consumerSecret = Xxx 
TwitterAgent.sources.twitter.accessToken = Xxx
TwitterAgent.sources.twitter.accessTokenSecret = Xxx
TwitterAgent.sources.twitter.maxBatchDurationMillis = 200 
TwitterAgent.sources.twitter.channels = memoryChannel
TwitterAgent.sources.twitter.keywords = lsu
  
TwitterAgent.channels.memoryChannel.type = memory
TwitterAgent.channels.memoryChannel.capacity = 10000
TwitterAgent.channels.memoryChannel.transactionCapacity = 1000
 
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = memoryChannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs:/user/flume/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.sinks.HDFS.hdfs.useLocalTimeStamp = true
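
One thing that can be ruled out before looking at the exception itself is a wiring mistake in the configuration. In Flume, a source lists its channels under `channels` (plural) and a sink names a single one under `channel` (singular), and both have to refer to a channel declared on the agent. The following is only a minimal sketch of such a check, not part of Flume: the helper names `parse_properties` and `check_wiring` are hypothetical, and `SAMPLE` is a shortened copy of the configuration above. It inspects the wiring only and would not detect the `UnsupportedOperationException` shown in the log.

```python
def parse_properties(text):
    """Parse simple `key = value` lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems with the channel references of one agent."""
    problems = []
    declared = set(props.get(f"{agent}.channels", "").split())

    # Sources reference one or more channels through the plural key `channels`.
    for source in props.get(f"{agent}.sources", "").split():
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source} has no channels")
        for channel in used:
            if channel not in declared:
                problems.append(
                    f"source {source} uses undeclared channel {channel}"
                )

    # Sinks reference exactly one channel through the singular key `channel`.
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel")
        if channel is None:
            problems.append(f"sink {sink} has no channel")
        elif channel not in declared:
            problems.append(f"sink {sink} uses undeclared channel {channel}")

    return problems


# Shortened copy of the configuration from the question (assumption: only the
# keys relevant to wiring are kept).
SAMPLE = """
# Naming the components on the current agent.
TwitterAgent.sources = twitter
TwitterAgent.channels = memoryChannel
TwitterAgent.sinks = HDFS

TwitterAgent.sources.twitter.channels = memoryChannel
TwitterAgent.channels.memoryChannel.type = memory
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = memoryChannel
"""

if __name__ == "__main__":
    for problem in check_wiring(parse_properties(SAMPLE), "TwitterAgent"):
        print(problem)
```

Run against the configuration above, the check reports nothing, which suggests the source, channel, and sink are connected consistently and the failure lies elsewhere.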

Copyright notice: Content author: pinkco12. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/49886162/hadoop-using-flume-to-pull-twitter-tweets
