Java with HDFS file read/write
Asked by Jinith on 2015-08-12T13:22:16

I am new to Hadoop and Java. I need to read from and write to a *.txt file stored on HDFS in my remote Cloudera distribution, and I have written this small Java program for it:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadHadoopFileData {
    public static void main(String[] args) throws IOException, URISyntaxException {

        Configuration configuration = new Configuration();
        FileSystem hdfs = FileSystem.get(new URI("hdfs://admin:[email protected]:8888"), configuration);
        Path file = new Path("hdfs://admin:[email protected]:8888/user/admin/Data/Tlog.txt");

        // Print the file line by line; try-with-resources closes the stream.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(hdfs.open(file)))) {
            String line = br.readLine();
            while (line != null) {
                System.out.println(line);
                line = br.readLine();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
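
For the write half of the task, here is a minimal sketch along the same lines. The output path and file contents are hypothetical placeholders, and it reuses the same connection URI as the read program above:

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteHadoopFileData {
    public static void main(String[] args) throws IOException, URISyntaxException {
        Configuration configuration = new Configuration();
        // Reuses the connection URI from the read program above.
        FileSystem hdfs = FileSystem.get(new URI("hdfs://admin:[email protected]:8888"), configuration);

        // Hypothetical output path, for illustration only.
        Path outFile = new Path("/user/admin/Data/Tlog_out.txt");

        // create(path, true) overwrites any existing file; try-with-resources
        // closes the stream, which flushes the data to HDFS.
        try (FSDataOutputStream out = hdfs.create(outFile, true)) {
            out.writeBytes("sample line 1\n");
            out.writeBytes("sample line 2\n");
        }
    }
}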

But when the line BufferedReader br = new BufferedReader(new InputStreamReader(hdfs.open(file))); is executed, I run into this error:

java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "KWTLT02221/169.254.208.16"; destination host is: "172.16.104.124":8888; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:254)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1220)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1210)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1200)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:271)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:238)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:231)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1498)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
    at ReadHadoopFileData.main(ReadHadoopFileData.java:26)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
    at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFrom(UnknownFieldSet.java:461)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFrom(UnknownFieldSet.java:579)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFrom(UnknownFieldSet.java:280)
    at com.google.protobuf.CodedInputStream.readGroup(CodedInputStream.java:240)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:488)
    at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.<init>(RpcHeaderProtos.java:2207)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.<init>(RpcHeaderProtos.java:2165)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:2295)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:2290)
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
    at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:3167)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1072)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)

Could someone help me get this resolved, please? I have been stuck on this for a day now.
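
For context, this particular InvalidProtocolBufferException ("Protocol message tag had invalid wire type") generally means the client reached a port that does not speak Hadoop's RPC protocol. On Cloudera distributions, 8888 is typically Hue's web UI port, while the NameNode RPC service conventionally listens on 8020; the authoritative value is fs.defaultFS in the cluster's core-site.xml. A minimal connection sketch assuming that conventional port:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadViaRpcPort {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // Port 8020 is an assumption: confirm the real NameNode RPC port in
        // the cluster's core-site.xml (fs.defaultFS) before relying on it.
        FileSystem hdfs = FileSystem.get(new URI("hdfs://172.16.104.124:8020"),
                configuration, "admin"); // connect as the "admin" user
        Path file = new Path("/user/admin/Data/Tlog.txt");
        System.out.println("exists? " + hdfs.exists(file));
    }
}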

