Databricks Certified Associate Developer for Apache Spark Practice Test

Last exam update: Oct 11, 2024
Page 1 out of 10. Viewing questions 1-10 out of 102

Question 1

Which of the following code blocks returns a new DataFrame with column storeReview where the pattern " End" has been removed from the end of column storeReview in DataFrame storesDF?

  • A. storesDF.withColumn("storeReview", col("storeReview").regexp_replace(" End$", ""))
  • B. storesDF.withColumn("storeReview", regexp_replace(col("storeReview"), " End$", ""))
  • C. storesDF.withColumn("storeReview", regexp_replace(col("storeReview"), " End$"))
  • D. storesDF.withColumn("storeReview", regexp_replace("storeReview", " End$", ""))
  • E. storesDF.withColumn("storeReview", regexp_extract(col("storeReview"), " End$", ""))
Answer: B
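
As a minimal sketch of the `regexp_replace` call in the Spark Scala API, the example below strips a trailing " End" from a review column. The sample rows, the object name `RegexpReplaceSketch`, and the local session settings are assumptions made for illustration; the real contents of storesDF are not shown in the question. In the Scala API the first argument to `regexp_replace` is a `Column`, which is why the column is wrapped in `col()`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace}

object RegexpReplaceSketch {
  // Returns the cleaned review strings for a small, made-up storesDF.
  def cleaned(spark: SparkSession): List[String] = {
    import spark.implicits._
    // Hypothetical sample data, not taken from the exam.
    val storesDF = Seq(
      "Great store End",
      "End of aisle was messy End",
      "No suffix here"
    ).toDF("storeReview")

    // " End$" is anchored with $, so only a trailing " End" is removed;
    // an "End" elsewhere in the string is left alone.
    val cleanedDF = storesDF.withColumn(
      "storeReview",
      regexp_replace(col("storeReview"), " End$", "")
    )
    cleanedDF.as[String].collect().toList
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("regexpReplaceSketch").getOrCreate()
    try cleaned(spark).foreach(println)
    finally spark.stop()
  }
}
```

Because `withColumn` is given the existing column name, the cleaned values overwrite storeReview rather than adding a second column.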

Question 2

Which of the following DataFrame operations is classified as an action?

  • A. DataFrame.drop()
  • B. DataFrame.coalesce()
  • C. DataFrame.take()
  • D. DataFrame.join()
  • E. DataFrame.filter()
Answer: C
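
The difference between a transformation and an action can be sketched as follows. The data, the `sqft` threshold, and the object name `ActionSketch` are assumptions for illustration only. `filter()` only describes a new DataFrame, while `take()` triggers a job and returns rows to the driver.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.col

object ActionSketch {
  def firstLargeStores(spark: SparkSession): Array[Row] = {
    import spark.implicits._
    // Hypothetical sample data.
    val storesDF = Seq((1, 500), (2, 1500), (3, 2500)).toDF("storeId", "sqft")

    // Transformation: returns a new DataFrame and only extends the query plan.
    val largeStoresDF = storesDF.filter(col("sqft") > 1000)

    // Action: runs the plan and returns an Array[Row] to the driver.
    largeStoresDF.take(1)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("actionSketch").getOrCreate()
    try firstLargeStores(spark).foreach(println)
    finally spark.stop()
  }
}
```

The return types are a quick way to tell the two apart: transformations such as `drop()`, `coalesce()`, `join()`, and `filter()` return a DataFrame, whereas `take()` returns local rows.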

Question 3

Which of the following operations can be used to return the number of rows in a DataFrame?

  • A. DataFrame.numberOfRows()
  • B. DataFrame.n()
  • C. DataFrame.sum()
  • D. DataFrame.count()
  • E. DataFrame.countDistinct()
Answer: D

Question 4

Which of the following code blocks returns a new DataFrame with a new column employeesPerSqft that is the quotient of column numberOfEmployees and column sqft, both of which are from DataFrame storesDF? Note that column employeesPerSqft is not in the original DataFrame storesDF.

  • A. storesDF.withColumn("employeesPerSqft", col("numberOfEmployees") / col("sqft"))
  • B. storesDF.withColumn("employeesPerSqft", "numberOfEmployees" / "sqft")
  • C. storesDF.select("employeesPerSqft", "numberOfEmployees" / "sqft")
  • D. storesDF.select("employeesPerSqft", col("numberOfEmployees") / col("sqft"))
  • E. storesDF.withColumn(col("employeesPerSqft"), col("numberOfEmployees") / col("sqft"))
Answer: A
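
A small sketch of adding a derived column with `withColumn`, assuming made-up store data and a hypothetical object name `QuotientSketch`. The new column name is passed as a string, and the value is a `Column` expression built from `col()`; plain strings cannot be divided.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object QuotientSketch {
  def withRatio(spark: SparkSession): DataFrame = {
    import spark.implicits._
    // Hypothetical sample data.
    val storesDF = Seq((1, 10, 1000), (2, 30, 6000))
      .toDF("storeId", "numberOfEmployees", "sqft")

    // First argument: name of the new column (String).
    // Second argument: Column expression computing its value.
    storesDF.withColumn("employeesPerSqft", col("numberOfEmployees") / col("sqft"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("quotientSketch").getOrCreate()
    try withRatio(spark).show()
    finally spark.stop()
  }
}
```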

User Votes:
A 1 votes
50%
B
50%
C
50%
D
50%
E
50%
Discussions
vote your answer:
A
B
C
D
E
0 / 1000

Question 5

The code block shown below contains an error. The code block is intended to return a new DataFrame that is the result of a cross join between DataFrame storesDF and DataFrame employeesDF. Identify the error.
Code block:
storesDF.join(employeesDF, "cross")

  • A. A cross join is not implemented by the DataFrame.join() operation; the standalone CrossJoin() operation should be used instead.
  • B. There is no direct cross join in Spark, but it can be implemented by performing an outer join on all columns of both DataFrames.
  • C. A cross join is not implemented by the DataFrame.join() operation; the DataFrame.crossJoin() operation should be used instead.
  • D. There is no key column specified; the key column "storeId" should be the second argument.
  • E. A cross join is not implemented by the DataFrame.join() operation; the standalone join() operation should be used instead.
Answer: C
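
A minimal sketch of `DataFrame.crossJoin()` in Scala, using made-up data and a hypothetical object name `CrossJoinSketch`. A cross join pairs every row of the left DataFrame with every row of the right one, so the row count is the product of the two inputs.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CrossJoinSketch {
  def crossed(spark: SparkSession): DataFrame = {
    import spark.implicits._
    // Hypothetical sample data: 2 stores and 3 employees.
    val storesDF = Seq(1, 2).toDF("storeId")
    val employeesDF = Seq("Ana", "Ben", "Chi").toDF("employeeName")

    // No key column is needed: every store row is paired with every employee row.
    storesDF.crossJoin(employeesDF)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("crossJoinSketch").getOrCreate()
    try crossed(spark).show()
    finally spark.stop()
  }
}
```

In the Scala API, a string passed as the second argument of `join()` is interpreted as the name of a column to join on, so it does not select a join type.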

Question 6

Which of the following will occur if there are more slots than there are tasks?

  • A. The Spark job will likely not run as efficiently as possible.
  • B. The Spark application will fail; there must be at least as many tasks as there are slots.
  • C. Some executors will shut down and allocate all slots on larger executors first.
  • D. More tasks will be automatically generated to ensure all slots are being used.
  • E. The Spark job will use just one single slot to perform all tasks.
Answer: A

Question 7

The code block shown below contains an error. The code block is intended to return a new DataFrame that is the result of an inner join between DataFrame storesDF and DataFrame employeesDF on column storeId. Identify the error.

Code block:

storesDF.join(employeesDF, Seq(storeId))

  • A. The key column storeId needs to be a string like "storeId".
  • B. The key column storeId needs to be specified in an expression of both DataFrame columns like storesDF.storeId === employeesDF.storeId.
  • C. The default argument to the joinType parameter is inner - an additional argument of left must be specified.
  • D. There is no DataFrame.join() operation - DataFrame.merge() should be used instead.
  • E. The key column storeId needs to be wrapped in the col() operation.
Answer: A
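
A sketch of an inner join on a shared key column, assuming made-up data and a hypothetical object name `InnerJoinSketch`. The `join(right, usingColumns)` overload takes the key names as a `Seq[String]`, so the column name is a quoted string, and this overload performs an inner equi-join without a join type argument.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object InnerJoinSketch {
  def joined(spark: SparkSession): DataFrame = {
    import spark.implicits._
    // Hypothetical sample data; store 2 has no employees and employee Chi has no store.
    val storesDF = Seq((1, "North"), (2, "South")).toDF("storeId", "division")
    val employeesDF = Seq((1, "Ana"), (1, "Ben"), (3, "Chi")).toDF("storeId", "employeeName")

    // Key column names are strings inside a Seq; the result keeps a single storeId column.
    storesDF.join(employeesDF, Seq("storeId"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("innerJoinSketch").getOrCreate()
    try joined(spark).show()
    finally spark.stop()
  }
}
```

An unquoted `storeId` would be read by the Scala compiler as a reference to a variable of that name, which is why the snippet in the question fails.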

User Votes:
A 1 votes
50%
B
50%
C
50%
D
50%
E
50%
Discussions
vote your answer:
A
B
C
D
E
0 / 1000

Question 8

The code block shown below should read a CSV at the file path filePath into a DataFrame with the specified schema schema. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

Code block:

__1__.__2__.__3__(__4__).format("csv").__5__(__6__)

  • A. 1. spark 2. read() 3. schema 4. schema 5. json 6. filePath
  • B. 1. spark 2. read() 3. schema 4. schema 5. load 6. filePath
  • C. 1. spark 2. read 3. format 4. "json" 5. load 6. filePath
  • D. 1. spark 2. read() 3. json 4. filePath 5. format 6. schema
  • E. 1. spark 2. read 3. schema 4. schema 5. load 6. filePath
Answer: E
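
The reader chain can be sketched in Scala as below. The two-column schema, the temporary file, and the object name `ReadCsvSketch` are assumptions for illustration. In the Scala API, `spark.read` is accessed without parentheses and returns a `DataFrameReader`, on which `schema`, `format`, and `load` are chained.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object ReadCsvSketch {
  // Hypothetical schema; the exam question does not show the real one.
  val schema: StructType = StructType(Seq(
    StructField("storeId", IntegerType, nullable = true),
    StructField("division", StringType, nullable = true)
  ))

  // spark.read (no parentheses) -> schema(...) -> format("csv") -> load(path)
  def readStores(spark: SparkSession, filePath: String): DataFrame =
    spark.read.schema(schema).format("csv").load(filePath)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("readCsvSketch").getOrCreate()
    // Write a tiny headerless CSV to a temp file so the sketch has something to read.
    val tmp = Files.createTempFile("stores", ".csv")
    Files.write(tmp, "1,North\n2,South\n".getBytes("UTF-8"))
    try readStores(spark, tmp.toString).show()
    finally spark.stop()
  }
}
```

Supplying the schema up front means Spark does not need to infer column types from the file.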

Question 9

Which of the following code blocks returns a new DataFrame where column division from DataFrame storesDF has been replaced and renamed to column state and column managerName from DataFrame storesDF has been replaced and renamed to column managerFullName?

  • A. storesDF.withColumnRenamed("division", "state").withColumnRenamed("managerName", "managerFullName")
  • B. storesDF.withColumn("state", "division").withColumn("managerFullName", "managerName")
  • C. storesDF.withColumn("state", col("division")).withColumn("managerFullName", col("managerName"))
  • D. storesDF.withColumnRenamed(Seq("division", "state"), Seq("managerName", "managerFullName"))
  • E. storesDF.withColumnRenamed("state", "division").withColumnRenamed("managerFullName", "managerName")
Answer: A

Question 10

The code block shown below should return a new DataFrame that is the result of an inner join between DataFrame storesDF and DataFrame employeesDF on column storeId. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

Code block:

storesDF.__1__(__2__, __3__, __4__)

  • A. 1. join 2. employeesDF 3. "inner" 4. storesDF.storeId === employeesDF.storeId
  • B. 1. join 2. employeesDF 3. "storeId" 4. "inner"
  • C. 1. merge 2. employeesDF 3. "storeId" 4. "inner"
  • E. 1. join 2. employeesDF 3. "inner" 4. "storeId"
Answer: D
