When running Hadoop on EC2, I seem to have two options:
- A: Manage the cluster myself, using the EC2-specific shell scripts that come with Hadoop.
- B: Use Elastic MapReduce, and pay a little extra for the convenience.
I'm leaning towards B, but I'd appreciate some advice from people with more experience. Here are my questions:
- Are there any tasks that can be done with one of these methods but not the other?
- Are there other options besides these two that I'm overlooking?
- If I choose B, how easy would it be to go back to A? That is, what's the danger of vendor lock-in?
Copyright notice: content author: Mike Baranczak. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/4964885/recommendations-for-hadoop-on-ec2