Wrong FS s3://ss-pprd-v2-dart//tempdir/962c6007-77c0-4294-b021-b9498e3d66ab/manifest.json -expected s3a://ss-pprd-v2-dart
NickName: Brajesh Kumar Mishra | Ask DateTime: 2022-03-10T20:29:40

I am using Spark 3.2.1, Java 8 (1.8.0_292, AdoptOpenJDK), and Scala 2.12.10, and I am trying to read and write data from/to Redshift using the jars and packages listed below. However, I am not able to write the data back. With earlier versions, writing the data back to Redshift created the Avro files along with one manifest.json file in the temp directory; with my current versions it still creates all the Avro files, but it does not create the manifest.json file.

Jars and packages:

RedshiftJDBC42-no-awssdk-1.2.54.1082.jar
hadoop-aws-3.3.1.jar
aws-java-sdk-1.12.173.jar
org.apache.spark:spark-avro_2.12:3.2.1
io.github.spark-redshift-community:spark-redshift_2.12:5.0.3
com.eclipsesource.minimal-json:minimal-json:0.9.5
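
The question does not show how these dependencies are wired into the job. Purely as an illustration, they could be attached to the SparkConf used in the code below; the local jar paths here are hypothetical placeholders, and spark.jars / spark.jars.packages are the standard Spark properties for local jars and Maven coordinates respectively:

from pyspark import SparkConf

conf = (
    SparkConf()
    .setAppName("Testing")
    # --jars equivalent: local jar files shipped with the job (placeholder paths)
    .set("spark.jars",
         "RedshiftJDBC42-no-awssdk-1.2.54.1082.jar,"
         "hadoop-aws-3.3.1.jar,"
         "aws-java-sdk-1.12.173.jar")
    # --packages equivalent: Maven coordinates resolved when the job starts
    .set("spark.jars.packages",
         "org.apache.spark:spark-avro_2.12:3.2.1,"
         "io.github.spark-redshift-community:spark-redshift_2.12:5.0.3,"
         "com.eclipsesource.minimal-json:minimal-json:0.9.5")
)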

Code I am trying to run:

from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession

conf = SparkConf().setAppName("Testing")
sc = SparkContext.getOrCreate(conf)
sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", AWS_ACCESS_KEY)
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", AWS_SECRET_KEY)

# df is the DataFrame being written back to Redshift (built earlier, not shown here)
df.write \
    .format("io.github.spark_redshift_community.spark.redshift") \
    .option("url", REDSHIFT_JDBC_URL) \
    .option("dbtable", MASTER_TABLE) \
    .option("forward_spark_s3_credentials", "true") \
    .option("extracopyoptions", EXTRACOPYOPTIONS) \
    .option("tempdir", "s3a://" + str(S3_BUCKET) + "/tempdir") \
    .mode("append") \
    .save()

print("Success")

Stack Trace:

Traceback (most recent call last):
  File "/Users/brajeshmishra/Documents/TEMP/Temp_Py.py", line 65, in <module>
    .mode("append") \
  File "/opt/homebrew/Cellar/apache-spark/3.2.1/libexec/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 738, in save
  File "/opt/homebrew/Cellar/apache-spark/3.2.1/libexec/python/lib/py4j-0.10.9.3-src.zip/py4j/java_gateway.py", line 1322, in __call__
  File "/opt/homebrew/Cellar/apache-

List item

spark/3.2.1/libexec/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
pyspark.sql.utils.IllegalArgumentException: Wrong FS s3://ss-pprd-v2-dart//tempdir/962c6007-77c0-4294-b021-b9498e3d66ab/manifest.json -expected s3a://ss-pprd-v2-dart

Copyright notice: content author 「Brajesh Kumar Mishra」, reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/71424185/wrong-fs-s3-ss-pprd-v2-dart-tempdir-962c6007-77c0-4294-b021-b9498e3d66ab-mani
