Can someone please point me to any resources that will help me manually set up and configure Hadoop (1.0.4) on EC2? I know there are lots of resources for accomplishing this with tools and services, but what I'm looking for is help figuring out which modifications to make by hand to the conf/*.xml files on both the master and the slaves in order to get Hadoop working.
Right now, I have 5 EC2 instances running, and each of them can run Hadoop jobs on its own in pseudo-distributed mode. I need to turn one into the master and the rest into slaves by editing the conf files, so that the slaves know where the NameNode and JobTracker are and the master knows about all the slaves.
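To make the question concrete, here is a minimal sketch of the kind of change I mean, not a verified working setup. The hostname `master.internal` is a placeholder for the master's private DNS name, and ports 9000 and 9001 are just commonly used choices, not requirements. As far as I understand, every node would carry the same `conf/core-site.xml` pointing at the NameNode:

```xml
<?xml version="1.0"?>
<!-- conf/core-site.xml (same on master and slaves) -->
<configuration>
  <property>
    <!-- Hadoop 1.x property naming the default filesystem, i.e. the NameNode -->
    <name>fs.default.name</name>
    <!-- master.internal and port 9000 are placeholder assumptions -->
    <value>hdfs://master.internal:9000</value>
  </property>
</configuration>
```

and the same `conf/mapred-site.xml` pointing at the JobTracker:

```xml
<?xml version="1.0"?>
<!-- conf/mapred-site.xml (same on master and slaves) -->
<configuration>
  <property>
    <!-- Hadoop 1.x property naming the JobTracker as host:port, no URI scheme -->
    <name>mapred.job.tracker</name>
    <!-- master.internal and port 9001 are placeholder assumptions -->
    <value>master.internal:9001</value>
  </property>
</configuration>
```

On the master, `conf/slaves` would then list the four slave hostnames, one per line, and `conf/masters` would list the host that runs the SecondaryNameNode. In pseudo-distributed mode both of these values point at `localhost`, so I assume that is the main thing that has to change.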
My understanding is that I will also have to configure the EC2 security group of the instances so that they can all talk to one another on the right ports. I think I'm OK with this.
Can anyone help me out with the configuration part, or point me towards something that might help?
Copyright Notice:Content Author:「RTF」,Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/15931458/configuring-hadoop-manually-on-ec2