Certified Data Engineer Associate Practice Exam, Free Latest Q&A, Page 1

Question 1

A data organization leader is upset about the data analysis team's reports being different from the data engineering team's reports. The leader believes the siloed nature of their organization's data engineering and data analysis architectures is to blame.
Which of the following describes how a data lakehouse could alleviate this issue?

A. Both teams would autoscale their work as data size evolves
B. Both teams would use the same source of truth for their work
C. Both teams would reorganize to report to the same department
D. Both teams would be able to collaborate on projects in real-time
E. Both teams would respond more quickly to ad-hoc requests

Answer:

b

User Votes:

A: 1 vote
B: 23 votes
C: 1 vote
D: no votes shown
E: no votes shown

Discussions

srachakonda (3 months, 1 week ago): Both teams would use the same source of truth for their work

[email protected] (1 month, 2 weeks ago): Both teams would use the same source of truth for their work

Question 2

Which of the following tools is used by Auto Loader to process data incrementally?

A. Checkpointing
B. Spark Structured Streaming
C. Data Explorer
D. Unity Catalog
E. Databricks SQL

Answer:

b

User Votes:

A: 6 votes
B: 17 votes
C: 1 vote
D: no votes shown
E: 1 vote

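As a rough illustration of answer B, Auto Loader is exposed as the `cloudFiles` source of a Structured Streaming read. The sketch below assumes `spark` is an active SparkSession on Databricks; the helper names `autoloader_options` and `read_incrementally` and all paths are hypothetical.

```python
def autoloader_options(file_format, schema_path):
    """Collect the Auto Loader reader options as a plain dict."""
    return {
        "cloudFiles.format": file_format,          # format of the incoming files
        "cloudFiles.schemaLocation": schema_path,  # where the inferred schema is tracked
    }


def read_incrementally(spark, source_path, file_format, schema_path):
    """Return a streaming DataFrame that picks up new files under source_path.

    `spark` is assumed to be an active SparkSession on Databricks.
    """
    return (
        spark.readStream
        .format("cloudFiles")  # Auto Loader is a Structured Streaming source
        .options(**autoloader_options(file_format, schema_path))
        .load(source_path)
    )
```

Because the read goes through `spark.readStream`, the incremental processing is handled by Structured Streaming rather than by a separate tool.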

Question 3

Which of the following is hosted completely in the control plane of the classic Databricks architecture?

A. Worker node
B. JDBC data source
C. Databricks web application
D. Databricks Filesystem
E. Driver node

Answer:

c

User Votes:

A: no votes shown
B: 2 votes
C: 18 votes
D: 1 vote
E: 4 votes

Discussions

[email protected] (1 month, 2 weeks ago): C. Databricks web application

Question 4

A data engineer is running code in a Databricks Repo that is cloned from a central Git repository. A colleague of the data engineer informs them that changes have been made and synced to the central Git repository. The data engineer now needs to sync their Databricks Repo to get the changes from the central Git repository.

Which of the following Git operations does the data engineer need to run to accomplish this task?

A. Merge
B. Push
C. Pull
D. Commit
E. Clone

Answer:

c

User Votes:

A: 2 votes
B: no votes shown
C: 16 votes
D: no votes shown
E: no votes shown

Discussions

[email protected] (1 month, 2 weeks ago): C. Pull

Question 5

A data engineer needs to create a table in Databricks using data from a CSV file at location /path/to/csv.

They run the following command:

Which of the following lines of code fills in the above blank to successfully complete the task?

A. None of these lines of code are needed to successfully complete the task
B. USING CSV
C. FROM CSV
D. USING DELTA
E. FROM "path/to/csv"

Answer:

b

User Votes:

A: 2 votes
B: 12 votes
C: 1 vote
D: no votes shown
E: 2 votes

Discussions

[email protected] (1 month, 2 weeks ago): B. USING CSV
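
The command from the question did not survive on this page, so the following is only a sketch of the general shape of a CSV-backed table definition, not the exam's statement. The table name `sales_csv`, the `header` option, and the helper `csv_table_ddl` are assumptions for illustration.

```python
def csv_table_ddl(table_name, location, header=True):
    """Build a CREATE TABLE statement for a table backed by CSV files."""
    header_value = "true" if header else "false"
    return (
        f"CREATE TABLE {table_name} "
        f"USING CSV "                        # tells Spark how to read the files
        f"OPTIONS (header '{header_value}') "
        f"LOCATION '{location}'"
    )


ddl = csv_table_ddl("sales_csv", "/path/to/csv")
print(ddl)
# On a cluster with an active SparkSession, this would be run as: spark.sql(ddl)
```

The `USING CSV` clause is what names the data source format for the files at the given location.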

Question 6

Which of the following must be specified when creating a new Delta Live Tables pipeline?

A. A key-value pair configuration
B. The preferred DBU/hour cost
C. A path to cloud storage location for the written data
D. A location of a target database for the written data
E. At least one notebook library to be executed

Answer:

e

User Votes:

A: 1 vote
B: 2 votes
C: 1 vote
D: 4 votes
E: 11 votes

Discussions

[email protected] (1 month, 2 weeks ago): D. A location of a target database for the written data

Question 7

A data engineer is working with two tables. Each of these tables is displayed below in its entirety.

The data engineer runs the following query to join these tables together:

Which of the following will be returned by the above query?

E. None

Answer:

d

User Votes:

E: 6 votes

Question 8

A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name.

They have the following incomplete code block:

____(f"SELECT customer_id, spend FROM {table_name}")

Which of the following can be used to fill in the blank to successfully complete the task?

A. spark.delta.sql
B. spark.delta.table
C. spark.table
D. dbutils.sql
E. spark.sql

Answer:

e

User Votes:

A: no votes shown
B: no votes shown
C: 2 votes
D: 2 votes
E: 10 votes

Discussions

[email protected] (1 month, 2 weeks ago): E. spark.sql
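
`spark.sql` takes the query as a string, so the f-string interpolates `table_name` before the query is submitted. A minimal sketch, assuming `spark` is an active SparkSession; the helper names and the table name `sales` are hypothetical.

```python
def customer_spend_query(table_name):
    """Build the query text; the f-string substitutes table_name into the SQL."""
    return f"SELECT customer_id, spend FROM {table_name}"


def customer_spend(spark, table_name):
    """Run the query through spark.sql and return the resulting DataFrame.

    `spark` is assumed to be an active SparkSession.
    """
    return spark.sql(customer_spend_query(table_name))


print(customer_spend_query("sales"))
# prints: SELECT customer_id, spend FROM sales
```

Since the table name is pasted into the SQL text as-is, this pattern is only appropriate when `table_name` comes from a trusted source.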

Question 9

Which of the following SQL keywords can be used to convert a table from a long format to a wide format?

A. TRANSFORM
B. PIVOT
C. SUM
D. CONVERT
E. WHERE

Answer:

b

User Votes:

A: no votes shown
B: 12 votes
C: no votes shown
D: 1 vote
E: 1 vote

Discussions

[email protected] (1 month, 2 weeks ago): B. PIVOT
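
To make long versus wide concrete, here is a pure-Python sketch of the kind of reshaping a pivot performs. It is not Spark SQL; the rows, the column names, and the `pivot_sum` helper are made up for illustration. Each distinct value of the pivot column becomes its own column, and the values are aggregated with a sum.

```python
from collections import defaultdict


def pivot_sum(rows, index, columns, values):
    """Reshape long-format dict rows into one wide dict per index value."""
    wide = defaultdict(lambda: defaultdict(int))
    for row in rows:
        # One output row per `index` value, one output column per `columns` value.
        wide[row[index]][row[columns]] += row[values]
    return {key: dict(cols) for key, cols in wide.items()}


# Long format: one row per (store, quarter) observation.
long_rows = [
    {"store": "north", "quarter": "Q1", "sales": 10},
    {"store": "north", "quarter": "Q2", "sales": 15},
    {"store": "south", "quarter": "Q1", "sales": 7},
    {"store": "south", "quarter": "Q1", "sales": 3},
]

# Wide format: one entry per store, with a column per quarter.
wide_rows = pivot_sum(long_rows, index="store", columns="quarter", values="sales")
print(wide_rows)
# prints: {'north': {'Q1': 10, 'Q2': 15}, 'south': {'Q1': 10}}
```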

Question 10

A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location.
Which of the following data entities should the data engineer create?

A. Database
B. Function
C. View
D. Temporary view
E. Table

Answer:

e

User Votes:

A: no votes shown
B: 1 vote
C: 8 votes
D: no votes shown
E: 7 votes

Discussions

[email protected] (1 month, 2 weeks ago): E. Table
