Pig UDF to write to a file to HDFS
NickName:Dheeraj R Ask DateTime:2013-10-21T02:33:17

I would like to read a complete file in a Pig UDF, then prepare an output file using Java's PrintWriter class and store it on HDFS.

Is this possible?

Steps followed:

1) Read the input file in the UDF and build a HashMap from it. [ACHIEVED]

2) Write the data to an output file by filtering the input file, using the HashMap as the filter. [YET TO BE ACHIEVED]

Can anyone help with step 2?

The aim is to create a file from within the Pig UDF and write to it.
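
For illustration, step 2 could look roughly like the sketch below. It is not a confirmed solution: the class name FilteredWriter, the method writeFiltered, and the assumption that each input line is tab-delimited with the lookup key in the first field are all hypothetical. The method writes to any OutputStream, so it can be tried locally. Inside the UDF, the stream would presumably be the one returned by Hadoop's FileSystem.create(Path). Since a UDF may run in several tasks at once, each task would likely need its own output path.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the name and the tab-delimited layout are assumptions.
final class FilteredWriter {

    // Writes every line whose first tab-separated field is a key of `lookup`
    // and returns the number of lines written. Closes `out` when done.
    // In a Pig UDF, `out` could be the stream returned by Hadoop's
    // FileSystem.create(new Path(...)) instead of a local stream.
    static int writeFiltered(List<String> lines,
                             Map<String, String> lookup,
                             OutputStream out) throws IOException {
        int written = 0;
        try (PrintWriter pw = new PrintWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                String key = line.split("\t", 2)[0];
                if (lookup.containsKey(key)) {
                    pw.print(line);
                    pw.print('\n'); // fixed newline, independent of platform
                    written++;
                }
            }
            // PrintWriter swallows IOExceptions, so check explicitly.
            if (pw.checkError()) {
                throw new IOException("failed to write filtered output");
            }
        }
        return written;
    }
}
```

Keeping the filtering logic independent of HDFS makes it testable with a ByteArrayOutputStream before wiring it into the UDF.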

Thanks,

Regards, Dheeraj Rampally.

Copyright Notice:Content Author:「Dheeraj R」,Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/19481227/pig-udf-to-write-to-a-file-to-hdfs

More about “Pig UDF to write to a file to HDFS” related questions

Apache Pig load entire relationship into UDF

I have a pig script that pertains to 2 Pig relations, let's say A and B. A is a small relationship, and B is a big one. My UDF should load all of A into memory on each machine and then use it while

Pig script cannot register UDF

I have a simple Pig script that uses a Python UDF I have created. The script completes fine if I remove the UDF portion. But when I try to register my UDF I get the following error: ERROR 2997:

Facing issue running Pig in local mode, failing in java udf

I have a Pig UDF (written in Java) which reads data from a JSON file present in HDFS and does further calculation. Below is the line of code (last line in the snippet) which is giving error. Becau...

Issue in executing Pig UDF

I wrote a UDF in pig where I want to pass all the tuple to the UDF function to filter . REGISTER /home/ec2-user/FilterColumnsUDF-0.0.1-SNAPSHOT.jar DEFINE SampleFilterUDF SampleFilterUDF() A = l...

Using UDF on Avro file in PIg script

I'm importing an avro file on HDFS to HBase using Pig, but I have to apply a user defined function (UDF) to the row id. I'm using the SHA function from Apache DataFU register datafu-pig-incubating...

Error while using python udf in Pig

I am trying to use python udf but it is throwing below error. I am using CDH5.2 cat /home/spanda20/pig_data/panda1.py def get_length(data): return len(data) REGISTER '/home/spanda20/pig_data/...

PIG UDF throwing error

I am getting an error in PIG script. PIG SCRIPT : REGISTER /var/lib/hadoop-hdfs/udf.jar; REGISTER /var/lib/hadoop-hdfs/udf2.jar; INPUT_LINES = Load 'hdfs:/Inputdata/

HDFS path in user defined function of Pig Latin

I have the following User Defined Function programmed in Java language: I defined FileWriter but an error message appear after the execution. The program: outputFile = new FileWriter("hdfs://Nae...

Check existence of a field in HDFS avro format using Pig/Python

I have a set of files in HDFS stored in Avro format. Some of them have a column named id:int as follows { "type" : "record", "name" : "metric", "fields" : [ { "name&
