Q1. The code block shown below should return a copy of DataFrame transactionsDf with an added column cos. This column should contain the values in column value converted to degrees, with the cosine of those converted values taken and rounded to two decimals. Choose the answer that correctly fills the blanks in the code block to accomplish this.
Code block:
transactionsDf.__1__(__2__, round(__3__(__4__(__5__)),2))
A. 1. withColumn 2. col('cos') 3. cos 4. degrees 5. transactionsDf.value
B. 1. withColumnRenamed 2. 'cos' 3. cos 4. degrees 5. 'transactionsDf.value'
C. 1. withColumn 2. 'cos' 3. cos 4. degrees 5. transactionsDf.value
D. 1. withColumn 2. col('cos') 3. cos 4. degrees 5. col('value')
E. 1. withColumn 2. 'cos' 3. degrees 4. cos 5. col('value')
Correct Answer: C
Q2. The code block displayed below contains one or more errors. The code block should load parquet files at location filePath into a DataFrame, only loading those files that have been modified before 2029-03-20 05:44:46. Spark should enforce a schema according to the schema shown below. Find the error.
Schema:
root
 |-- itemId: integer (nullable = true)
 |-- attributes: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- supplier: string (nullable = true)
Code block:
schema = StructType([
    StructType("itemId", IntegerType(), True),
    StructType("attributes", ArrayType(StringType(), True), True),
    StructType("supplier", StringType(), True)
])

spark.read.options("modifiedBefore", "2029-03-20T05:44:46").schema(schema).load(filePath)
A. The attributes array is specified incorrectly, Spark cannot identify the file format, and the syntax of the call to Spark's DataFrameReader is incorrect.
B. Columns in the schema definition use the wrong object type and the syntax of the call to Spark's DataFrameReader is incorrect.
C. The data type of the schema is incompatible with the schema() operator and the modification date threshold is specified incorrectly.
D. Columns in the schema definition use the wrong object type, the modification date threshold is specified incorrectly, and Spark cannot identify the file format.
E. Columns in the schema are unable to handle empty values and the modification date threshold is specified incorrectly.
Correct Answer: D
Q3. Which of the following code blocks returns all unique values across all values in columns value and productId in DataFrame transactionsDf in a one-column DataFrame?
A. transactionsDf.select('value').join(transactionsDf.select('productId'), col('value')==col('productId'), 'outer')
B. transactionsDf.select(col('value'), col('productId')).agg({'*': 'count'})
C. transactionsDf.select('value', 'productId').distinct()
D. transactionsDf.select('value').union(transactionsDf.select('productId')).distinct()
E. transactionsDf.agg({'value': 'collect_set', 'productId': 'collect_set'})
Correct Answer: D