Databricks Certified Machine Learning Professional Practice Test
Last exam update: Jan 15, 2025
Page 1 out of 5. Viewing questions 1-10 out of 57
Question 1
Which of the following MLflow Model Registry use cases requires the use of an HTTP Webhook?
A.
Starting a testing job when a new model is registered
B.
Updating data in a source table for a Databricks SQL dashboard when a model version transitions to the Production stage
C.
Sending an email alert when an automated testing Job fails
D.
None of these use cases require the use of an HTTP Webhook
E.
Sending a message to a Slack channel when a model version transitions stages
Answer:
e
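For context, a Model Registry webhook that targets an HTTP endpoint (such as a Slack incoming-webhook URL) is registered by posting a JSON body to the Databricks registry-webhooks REST API. The sketch below only builds that body and sends nothing. The model name, the target URL, and the helper name are hypothetical, and the field names (`model_name`, `events`, `http_url_spec`) are an assumption based on the Databricks REST API documentation, so verify them against the current reference before use.

```python
import json


def build_http_webhook_request(model_name: str, target_url: str) -> dict:
    """Build (but do not send) the JSON body for an HTTP registry webhook.

    Field names are assumed from the Databricks registry-webhooks API;
    the values passed in are placeholders.
    """
    return {
        "model_name": model_name,
        # Fire when any version of the model changes stage.
        "events": ["MODEL_VERSION_TRANSITIONED_STAGE"],
        "description": "Notify a chat channel on stage transitions",
        "status": "ACTIVE",
        # An HTTP webhook carries an http_url_spec; a job webhook would
        # carry a job_spec instead.
        "http_url_spec": {"url": target_url},
    }


# Hypothetical model name and endpoint, for illustration only.
request_body = build_http_webhook_request(
    "recommender", "https://example.com/hooks/placeholder"
)
print(json.dumps(request_body, indent=2))
```

A job-triggering use case (for example, starting a test job) would use a job webhook rather than an HTTP one, which is why only the external-endpoint case needs `http_url_spec`.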
Question 2
A data scientist has created a Python function compute_features that returns a Spark DataFrame with the following schema:
The resulting DataFrame is assigned to the features_df variable. The data scientist wants to create a Feature Store table using features_df. Which of the following code blocks can they use to create and populate the Feature Store table using the Feature Store Client fs?
C.
features_df.write.mode("fs").path("new_table")
D.
None
E.
features_df.write.mode("feature").path("new_table")
Answer:
d
Question 3
A machine learning engineer wants to deploy a model for real-time serving using MLflow Model Serving. For the model, the machine learning engineer currently has one model version in each of the stages in the MLflow Model Registry. The engineer wants to know which model versions can be queried once Model Serving is enabled for the model. Which of the following lists all of the MLflow Model Registry stages whose model versions are automatically deployed with Model Serving?
A.
Staging, Production, Archived
B.
Production
C.
None, Staging, Production, Archived
D.
Staging, Production
E.
None, Staging, Production
Answer:
d
Question 4
A machine learning engineering team has written predictions computed in a batch job to a Delta table for querying. However, the team has noticed that the querying is running slowly. The team has already tuned the size of the data files. Upon investigating, the team has concluded that the rows meeting the query condition are sparsely located throughout each of the data files. Based on the scenario, which of the following optimization techniques could speed up the query by colocating similar records while considering values in multiple columns?
A.
Z-Ordering
B.
Bin-packing
C.
Write as a Parquet file
D.
Data skipping
E.
Tuning the file size
Answer:
a
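To illustrate how Z-Ordering colocates records across more than one column, the sketch below interleaves the bits of two column values into a single Z-order (Morton) key, sorts rows by that key, and counts how many fixed-size "files" a query has to touch. It is a toy model of the idea only, not Delta Lake's implementation; in Delta the technique is applied with a command of the form `OPTIMIZE <table> ZORDER BY (col1, col2)`. The grid data, file size, and helper names are hypothetical.

```python
from itertools import product


def morton_key(x: int, y: int, bits: int = 8) -> int:
    """Interleave the bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return key


def split_into_files(rows, rows_per_file):
    """Chunk an ordered list of rows into equally sized 'files'."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def files_touched(files, predicate):
    """Count the files that contain at least one row matching the predicate."""
    return sum(1 for f in files if any(predicate(row) for row in f))


rows = list(product(range(4), range(4)))  # 16 (x, y) rows

sorted_by_x = split_into_files(sorted(rows), 4)
z_ordered = split_into_files(sorted(rows, key=lambda r: morton_key(*r)), 4)


def both_columns(row):
    return row[0] < 2 and row[1] < 2


def y_only(row):
    return row[1] < 2


print("x<2 and y<2, sorted by x:", files_touched(sorted_by_x, both_columns))
print("x<2 and y<2, Z-ordered:  ", files_touched(z_ordered, both_columns))
print("y<2 only, sorted by x:   ", files_touched(sorted_by_x, y_only))
print("y<2 only, Z-ordered:     ", files_touched(z_ordered, y_only))
```

Sorting on a single column helps only queries on that column, whereas the interleaved key keeps rows that are close in both columns in the same file, so fewer files need to be read.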
Question 5
Which of the following is an advantage of using the python_function (pyfunc) model flavor over the built-in library-specific model flavors?
A.
python_function provides no benefits over the built-in library-specific model flavors
B.
python_function can be used to deploy models in a parallelizable fashion
C.
python_function can be used to deploy models without worrying about which library was used to create the model
D.
python_function can be used to store models in an MLmodel file
E.
python_function can be used to deploy models without worrying about whether they are deployed in batch, streaming, or real-time environments
Answer:
c
Question 6
Which of the following is a benefit of logging a model signature with an MLflow model?
A.
The model will have a unique identifier in the MLflow experiment
B.
The schema of input data can be validated when serving models
C.
The model can be deployed using real-time serving tools
D.
The model will be secured by the user that developed it
E.
The schema of input data will be converted to match the signature
Answer:
b
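As a rough illustration of what validating input against a signature means at serving time, the sketch below checks an incoming record against an expected column-to-type mapping and reports mismatches. It is a plain-Python stand-in, not MLflow's schema enforcement; the column names, the types, and the `schema_errors` helper are hypothetical.

```python
# Hypothetical signature: column name -> expected Python type.
EXPECTED_SCHEMA = {"age": int, "income": float}


def schema_errors(record: dict, schema: dict) -> list:
    """Return a list of human-readable schema violations (empty if valid)."""
    errors = []
    for name, expected_type in schema.items():
        if name not in record:
            errors.append(f"missing column '{name}'")
        elif type(record[name]) is not expected_type:
            # Strict check: report the mismatch instead of converting the value.
            errors.append(
                f"column '{name}' expected {expected_type.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    return errors


good_record = {"age": 41, "income": 52000.0}
bad_record = {"age": "41", "income": 52000.0}

print(schema_errors(good_record, EXPECTED_SCHEMA))
print(schema_errors(bad_record, EXPECTED_SCHEMA))
```

The point of the sketch is that a malformed request is rejected with a clear error before it reaches the model, rather than being silently reshaped to fit.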
Question 7
A machine learning engineer has deployed a model recommender using MLflow Model Serving. They now want to query the version of that model that is in the Production stage of the MLflow Model Registry. Which of the following model URIs can be used to query the described model version?
A.
https:///model-serving/recommender/Production/invocations
B.
The version number of the model version in Production is necessary to complete this task.
C.
https:///model/recommender/stage-production/invocations
D.
https:///model-serving/recommender/stage-production/invocations
E.
https:///model/recommender/Production/invocations
Answer:
e
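A stage-based invocation URL for legacy MLflow Model Serving can be assembled as in the sketch below, which builds a request object without sending it. The workspace host, token, and input record are hypothetical placeholders, and both the `/model/<name>/<stage>/invocations` path and the `dataframe_records` payload key are assumptions that depend on the serving and MLflow versions in use.

```python
import json
import urllib.request


def build_invocation_request(host, model_name, stage_or_version, records, token):
    """Build (but do not send) a POST request to a model-serving endpoint."""
    url = f"https://{host}/model/{model_name}/{stage_or_version}/invocations"
    # Payload shape is an assumption; older MLflow versions accept other formats.
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host, token, and input record, for illustration only.
request = build_invocation_request(
    host="my-workspace.example.com",
    model_name="recommender",
    stage_or_version="Production",
    records=[{"user_id": 1}],
    token="PLACEHOLDER_TOKEN",
)
print(request.get_method(), request.full_url)
```

Addressing the endpoint by stage name means the caller does not need to know which version number currently sits in Production.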
Question 8
Which of the following deployment paradigms can centrally compute predictions for a single record with exceedingly fast results?
A.
Streaming
B.
Batch
C.
Edge/on-device
D.
None of these strategies will accomplish the task.
E.
Real-time
Answer:
e
Question 9
Which of the following describes label drift?
A.
Label drift is when there is a change in the distribution of the predicted target given by the model
B.
None of these describe label drift
C.
Label drift is when there is a change in the distribution of an input variable
D.
Label drift is when there is a change in the relationship between input variables and target variables
E.
Label drift is when there is a change in the distribution of a target variable
Answer:
e
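Label drift concerns the distribution of the target variable itself, so one simple way to look for it is to compare label frequencies between a reference window and a recent window. The sketch below uses total variation distance for that comparison; the sample labels, the threshold, and the helper names are hypothetical, and other tests (for example a chi-squared test) could be substituted.

```python
from collections import Counter


def label_distribution(labels):
    """Return the relative frequency of each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)


# Hypothetical target values observed at training time and in production.
reference_labels = ["churn"] * 20 + ["stay"] * 80
current_labels = ["churn"] * 50 + ["stay"] * 50

tvd = total_variation_distance(
    label_distribution(reference_labels),
    label_distribution(current_labels),
)

DRIFT_THRESHOLD = 0.1  # arbitrary cut-off chosen for illustration
print(f"total variation distance: {tvd:.2f}")
print("label drift suspected" if tvd > DRIFT_THRESHOLD else "no label drift flagged")
```

Running the same comparison on an input column would instead measure feature drift, and comparing the model's predicted labels would measure prediction drift.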
Question 10
A data scientist has developed a scikit-learn model sklearn_model and they want to log the model using MLflow. They write the following incomplete code block:
Which of the following lines of code can be used to fill in the blank so the code block can successfully complete the task?
A.
mlflow.spark.track_model(sklearn_model, "model")
B.
mlflow.sklearn.log_model(sklearn_model, "model")
C.
mlflow.spark.log_model(sklearn_model, "model")
D.
mlflow.sklearn.load_model("model")
E.
mlflow.sklearn.track_model(sklearn_model, "model")