Databricks Certified Data Engineer Associate practice test


Last exam update: Nov 16, 2024
Page 1 out of 8. Viewing questions 1-10 out of 90

Question 1

A data organization leader is upset that the data analysis team's reports differ from the data engineering team's reports. The leader believes the siloed nature of the organization's data engineering and data analysis architectures is to blame.
Which of the following describes how a data lakehouse could alleviate this issue?

  • A. Both teams would autoscale their work as data size evolves
  • B. Both teams would use the same source of truth for their work
  • C. Both teams would reorganize to report to the same department
  • D. Both teams would be able to collaborate on projects in real-time
  • E. Both teams would respond more quickly to ad-hoc requests
Answer:

b


Question 2

Which of the following tools is used by Auto Loader to process data incrementally?

  • A. Checkpointing
  • B. Spark Structured Streaming
  • C. Data Explorer
  • D. Unity Catalog
  • E. Databricks SQL
Answer:

b
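
To make "process data incrementally" concrete, here is a toy sketch in plain Python of the general idea: each run picks up only files it has not seen before and records its progress. It is not Auto Loader's implementation and does not use Spark; the function name `process_new_files` and the JSON checkpoint file are assumptions made for illustration.

```python
import json
import os
import tempfile


def process_new_files(source_dir, checkpoint_path):
    """Return only the files not seen on earlier runs, then record them."""
    seen = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            seen = set(json.load(f))
    # Anything in the source directory that is not in the checkpoint is new.
    new_files = sorted(set(os.listdir(source_dir)) - seen)
    with open(checkpoint_path, "w") as f:
        json.dump(sorted(seen | set(new_files)), f)
    return new_files


# Hypothetical landing directory and checkpoint location.
source_dir = tempfile.mkdtemp()
checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

open(os.path.join(source_dir, "a.csv"), "w").close()
first_run = process_new_files(source_dir, checkpoint_path)

open(os.path.join(source_dir, "b.csv"), "w").close()
second_run = process_new_files(source_dir, checkpoint_path)

print(first_run, second_run)
```

The second run returns only the file that arrived after the first run. In Databricks, Auto Loader is used through a streaming read with the `cloudFiles` source format rather than a hand-written loop like this one.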


Question 3

Which of the following is hosted completely in the control plane of the classic Databricks architecture?

  • A. Worker node
  • B. JDBC data source
  • C. Databricks web application
  • D. Databricks Filesystem
  • E. Driver node
Answer:

c


Question 4

A data engineer is running code in a Databricks Repo that is cloned from a central Git repository. A colleague of the data engineer informs them that changes have been made and synced to the central Git repository. The data engineer now needs to sync their Databricks Repo to get the changes from the central Git repository.

Which of the following Git operations does the data engineer need to run to accomplish this task?

  • A. Merge
  • B. Push
  • C. Pull
  • D. Commit
  • E. Clone
Answer:

c


Question 5

A data engineer needs to create a table in Databricks using data from a CSV file at location /path/to/csv.

They run the following command:



Which of the following lines of code fills in the above blank to successfully complete the task?

  • A. None of these lines of code are needed to successfully complete the task
  • B. USING CSV
  • C. FROM CSV
  • D. USING DELTA
  • E. FROM "path/to/csv"
Answer:

b


Question 6

Which of the following must be specified when creating a new Delta Live Tables pipeline?

  • A. A key-value pair configuration
  • B. The preferred DBU/hour cost
  • C. A path to cloud storage location for the written data
  • D. A location of a target database for the written data
  • E. At least one notebook library to be executed
Answer:

e


Question 7

A data engineer is working with two tables. Each of these tables is displayed below in its entirety.



The data engineer runs the following query to join these tables together:



Which of the following will be returned by the above query?

  • E. None
Answer:

d


Question 8

A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name.

They have the following incomplete code block:

____(f"SELECT customer_id, spend FROM {table_name}")

Which of the following can be used to fill in the blank to successfully complete the task?

  • A. spark.delta.sql
  • B. spark.delta.table
  • C. spark.table
  • D. dbutils.sql
  • E. spark.sql
Answer:

e
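
As a small illustration of the string-building part of this question, the sketch below shows how an f-string substitutes a Python variable into the query text. The table name `sales` and the helper `build_query` are hypothetical; the Spark call itself appears only as a comment because it needs a live `spark` session, which a Databricks notebook provides.

```python
def build_query(table_name):
    """Build the query text by interpolating the table name with an f-string."""
    return f"SELECT customer_id, spend FROM {table_name}"


table_name = "sales"  # hypothetical table name
query = build_query(table_name)
print(query)

# In a Databricks notebook, where `spark` is predefined, the query would run as:
# df = spark.sql(query)
```

Plain string interpolation like this is only safe when the table name comes from trusted code, since the value is pasted into the SQL text unchanged.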


Question 9

Which of the following SQL keywords can be used to convert a table from a long format to a wide format?

  • A. TRANSFORM
  • B. PIVOT
  • C. SUM
  • D. CONVERT
  • E. WHERE
Answer:

b
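
To show what "long to wide" means, here is a minimal plain-Python sketch of the reshaping that a pivot performs. The sample rows and the `pivot` helper are made up for illustration; in Spark SQL the same reshaping is written with a `PIVOT` clause rather than a loop.

```python
from collections import defaultdict

# Long format: one row per (customer, month) pair. Sample data is hypothetical.
long_rows = [
    ("alice", "jan", 10),
    ("alice", "feb", 20),
    ("bob", "jan", 5),
]


def pivot(rows):
    """Turn (key, column, value) rows into one record per key, summing values."""
    wide = defaultdict(dict)
    for key, column, value in rows:
        wide[key][column] = wide[key].get(column, 0) + value
    return dict(wide)


# Wide format: one record per customer, one field per month.
wide_rows = pivot(long_rows)
print(wide_rows)
```

Each distinct value of the pivoted column (here, the month) becomes its own field, and an aggregate such as a sum resolves duplicate (key, column) pairs.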


Question 10

A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location.
Which of the following data entities should the data engineer create?

  • A. Database
  • B. Function
  • C. View
  • D. Temporary view
  • E. Table
Answer:

e
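
The difference between a table, a view, and a temporary view can be sketched with SQLite from the Python standard library. This is an analogy only, not Databricks behavior, and all object names are hypothetical: a second connection stands in for "another session", and the database file stands in for the physical location.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# First session: create a source table, then a temp view, a view, and a table.
first = sqlite3.connect(db_path)
first.execute("CREATE TABLE orders (customer_id INTEGER, spend REAL)")
first.execute("INSERT INTO orders VALUES (1, 9.5)")
first.execute(
    "CREATE TEMP VIEW big_spenders AS SELECT * FROM orders WHERE spend > 5"
)
first.execute("CREATE VIEW spend_view AS SELECT customer_id, spend FROM orders")
first.execute(
    "CREATE TABLE order_summary AS "
    "SELECT customer_id, SUM(spend) AS total FROM orders GROUP BY customer_id"
)
first.commit()
first.close()

# Second session: list what is still there and what kind of object each is.
second = sqlite3.connect(db_path)
kinds = dict(second.execute("SELECT name, type FROM sqlite_master"))
second.close()
print(kinds)
```

The temporary view disappears with the first session. The view survives, but only as a stored query definition. The table built from the query is the one entity that both survives and holds its own stored data.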
