Description
Exam Name: Databricks Certified Data Engineer Associate Exam
Exam Code: Databricks-Certified-Data-Engineer-Associate
Related Certification(s): Databricks Data Engineer Associate Certification
Certification Provider: Databricks
Number of Databricks-Certified-Data-Engineer-Associate practice questions in our database:
Expected Databricks-Certified-Data-Engineer-Associate Exam Topics, as suggested by Databricks:
- Module 1: Databricks Lakehouse Platform: This topic covers the relationship between the data lakehouse and the data warehouse, improvements in data quality, comparing and contrasting silver and gold tables, elements of the Databricks Platform Architecture, and differentiating between all-purpose clusters and job clusters. It also addresses how cluster software is versioned, how clusters can be filtered and terminated, how to use multiple languages within a notebook, how to run one notebook from another, how notebooks can be shared, Git operations, and the limitations of Databricks Notebooks. Lastly, the topic describes how Databricks Repos enables CI/CD workflows.
- Module 2: ELT with Apache Spark: This topic focuses on extracting data, identifying the prefix included after the FROM keyword, creating views, deduplicating rows, creating new tables, using dot syntax to extract nested data, parsing JSON, and defining a SQL UDF. Moreover, the topic delves into describing the security model for sharing SQL UDFs, identifying the location of a function, and identifying the PIVOT clause.
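Two of the Module 2 topics, defining a SQL UDF and parsing JSON, can be sketched in Spark SQL. This is an illustrative sketch only; the function name, table name, and columns (`full_name`, `customers`, `profile`) are hypothetical, not from the exam guide.

```sql
-- Hypothetical SQL UDF that concatenates two string columns.
CREATE OR REPLACE FUNCTION full_name(first STRING, last STRING)
  RETURNS STRING
  RETURN concat(first, ' ', last);

-- Parse a JSON string column into a struct, then access a field
-- with dot syntax (here, `profile` and `customers` are assumed names).
SELECT from_json(profile, 'STRUCT<city: STRING, zip: STRING>').city AS city
FROM customers;
```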
- Module 3: Incremental Data Processing: This topic includes questions about identifying Delta Lake, the benefits of ACID transactions, scenarios in which to use an external table, identifying the location of a table, the benefits of Z-ordering, the kind of files involved, CTAS as a solution, the impact of ON VIOLATION DROP ROW and ON VIOLATION FAIL UPDATE, and the components necessary to create a new DLT pipeline. Moreover, the topic also discusses the directory structure of Delta Lake files, generated columns, adding a table comment, and the benefits of the MERGE command.
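The MERGE command and the ON VIOLATION clauses mentioned above can be sketched as follows. Table names (`customers`, `updates`, `clean_orders`, `raw_orders`) and the constraint are hypothetical examples, not material from the exam itself.

```sql
-- Hypothetical upsert into a Delta table with MERGE:
-- update matching rows, insert new ones.
MERGE INTO customers AS t
USING updates AS s
ON t.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- A Delta Live Tables expectation that silently drops rows
-- failing the constraint (ON VIOLATION DROP ROW).
CREATE OR REFRESH STREAMING LIVE TABLE clean_orders (
  CONSTRAINT valid_id EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW
)
AS SELECT * FROM STREAM(LIVE.raw_orders);
```

With ON VIOLATION FAIL UPDATE instead, a violating row would stop the pipeline update rather than being dropped.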
- Module 4: Production Pipelines: This topic focuses on the advantages of using multiple tasks in Jobs, scenarios in which a predecessor task should be set up, CRON as a scheduling option, and how an alert can be sent via email. The topic also discusses setting up a predecessor task in Jobs, reviewing a task's execution history, and debugging a failed task. Lastly, it delves into setting up a retry policy in case of failure and creating an alert for a failed task.
- Module 5: Data Governance: This topic identifies one of the four areas of data governance, Unity Catalog securables, and the cluster security modes. It also explains how to create a UC-enabled all-purpose cluster, create a DBSQL warehouse, and implement data object access control.
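Data object access control in Unity Catalog is expressed with GRANT statements. A minimal sketch, assuming a catalog `main`, a schema `sales`, a table `orders`, and a group `analysts` (all hypothetical names):

```sql
-- Hypothetical Unity Catalog grants: a principal needs USE CATALOG
-- and USE SCHEMA on the parents before SELECT on a table is usable.
GRANT USE CATALOG ON CATALOG main TO `analysts`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `analysts`;
GRANT SELECT ON TABLE main.sales.orders TO `analysts`;
```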