Unable to get the expected reduced output using Mapreduce in Hadoop
NickName: Prabhat Kumar    Ask DateTime: 2016-01-22T06:56:40


I am trying to learn MapReduce and am working on the following task.

My input is as below (State, Sport, Amount in USD):

California Football 69.09
California Swimming 31.5
Illinois Golf 8.31
Illinois Tennis 15.75
Oklahoma Golf 15.44
Oklahoma Tennis 8.33
Texas Golf 16.71
Texas Swimming 71.59
Washington Football 50.32000000000001

I expect the output to show which sport is most popular in each state (the sport with the highest sales of sport items). For example:

California Football 69.09
Illinois Tennis 15.75
Oklahoma Golf 15.44

and so on.

Below are my Mapper, Reducer, and Driver code:

Mapper code:

package org.assignment.sports;

import java.io.IOException;


import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Sports_Mapper2 extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Each input line is "State Sport Amount"
        String[] s = value.toString().split(" ");
        String Sport_State = s[0];          // the state, used as the output key
        String other = s[1] + " " + s[2];   // "Sport Amount", passed through as the value
        context.write(new Text(Sport_State), new Text(other));
    }
}

Reducer Code:

package org.assignment.sports;

import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;


public class Sports_Reducer2 extends Reducer<Text, Text, Text, DoubleWritable> {

    private static double MAX = 0.00;

    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        for (Text value : values) {
            String[] k = value.toString().split(" ");
            DoubleWritable price = new DoubleWritable(Double.parseDouble(k[1]));
            if (price.get() > MAX) {
                MAX = price.get();
            } else {
                continue;
            }
            String ss = key.toString() + " " + k[0];
            context.write(new Text(ss), new DoubleWritable(MAX));
        }
    }
}

Driver Code:

package org.assignment.sports;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class Sports_Driver2 {
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();

        Job job = new Job(conf, "Sports_Driver2");

        String[] otherArgs =new GenericOptionsParser(conf, args).getRemainingArgs();

        job.setJarByClass(Sports_Driver2.class);
        job.setMapperClass(Sports_Mapper2.class);
        job.setReducerClass(Sports_Reducer2.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job,new Path(otherArgs[1]));

        System.exit(job.waitForCompletion(true)? 0: 1);
    }

}

I am getting the output as below:

California Football 69.09
Texas Swimming 71.59

Where am I going wrong? Any help is appreciated.

Copyright Notice: Content author 「Prabhat Kumar」, reproduced under the CC 4.0 BY-SA license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/34935962/unable-to-get-the-expected-reduced-output-using-mapreduce-in-hadoop

Answers
byteherder 2016-01-22T00:43:33

The problem is that the MAX value in the Reducer is not being reset after each particular state is written.

String ss = key.toString() + " " + k[0];
context.write(new Text(ss), new DoubleWritable(MAX));
MAX = 0.00;
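
A minimal sketch of one way to apply this idea (not the answerer's exact code; the variable names here are only illustrative): keep the running maximum in a local variable instead of a static field so it cannot carry over between states, and emit a single record per state after all of its values have been examined.

import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Sports_Reducer2 extends Reducer<Text, Text, Text, DoubleWritable> {

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Local to this reduce() call, so every state starts from zero again
        double max = 0.00;
        String bestSport = "";

        for (Text value : values) {
            // Each value is "Sport Amount"
            String[] k = value.toString().split(" ");
            double price = Double.parseDouble(k[1]);
            if (price > max) {
                max = price;
                bestSport = k[0];
            }
        }

        // Write exactly one line per state: "State Sport" -> highest amount
        context.write(new Text(key.toString() + " " + bestSport), new DoubleWritable(max));
    }
}

Keeping the maximum in a local variable also removes the static field, which would otherwise be shared across every reduce() call handled by the same JVM.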

