Expert Trainer | Hands-on Training
• What is the use of Databricks?
• Spark Architecture
• Workspace
• Types of Clusters and Runtimes
• Notebooks
• Jobs
• Upload files into DBFS
• dbutils.fs
• dbutils.data
• dbutils.data.summarize
• dbutils.fs.cp
• dbutils.fs.head
• dbutils.fs.mkdirs
• dbutils.notebook.run()
• dbutils.notebook.exit()
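A minimal sketch of chaining notebooks with `dbutils.notebook.run()` and `dbutils.notebook.exit()`. This runs only inside a Databricks notebook, where `dbutils` is injected by the runtime; the child notebook path and argument names are placeholders.

```python
# Call a child notebook from a driver notebook (Databricks only).
# Arguments surface in the child as widgets; the call blocks until
# the child finishes or the timeout (in seconds) elapses.
result = dbutils.notebook.run("./child_notebook", 600, {"run_date": "2024-01-01"})

# `result` is whatever string the child passed to dbutils.notebook.exit("...")
print(result)
```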
• dbutils.widgets.combobox
• dbutils.widgets.dropdown
• dbutils.widgets.multiselect
• dbutils.widgets.text
• dbutils.widgets.get
• dbutils.widgets.getArgument
• dbutils.widgets.remove
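A short sketch of the widget utilities above. Widgets only exist inside a Databricks notebook; the widget names and choices here are illustrative.

```python
# Create input widgets at the top of a notebook (Databricks only).
dbutils.widgets.text("env", "dev", "Environment")
dbutils.widgets.dropdown("region", "eastus", ["eastus", "westus"], "Region")
dbutils.widgets.combobox("team", "data", ["data", "ml"], "Team")
dbutils.widgets.multiselect("days", "Mon", ["Mon", "Tue", "Wed"], "Days")

env = dbutils.widgets.get("env")             # read the current value
region = dbutils.widgets.getArgument("region")  # legacy alias for get()

dbutils.widgets.remove("days")               # drop a single widget
```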
• Create mount point using Account Key
• Create mount point using SAS Token
• Connect ADLS Gen2 to Databricks
• Delete or Unmount Mount Points in Azure Databricks
• mounts() & refreshMounts() commands of File System Utilities in Azure Databricks
• Update Mount Point (dbutils.fs.updateMount()) in Azure Databricks
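A minimal sketch of mounting an ADLS Gen2 container with an account key, covering the mount topics above. Databricks-only; the storage account, container, and Key Vault scope/secret names are placeholders.

```python
# Mount an ADLS Gen2 container using an account key (Databricks only).
storage_account = "mystorageacct"   # hypothetical storage account
container = "raw"                   # hypothetical container
# Pull the key from a secret scope rather than hard-coding it
# (scope and secret names are assumptions).
account_key = dbutils.secrets.get("kv-scope", "storage-key")

dbutils.fs.mount(
    source=f"abfss://{container}@{storage_account}.dfs.core.windows.net/",
    mount_point=f"/mnt/{container}",
    extra_configs={
        f"fs.azure.account.key.{storage_account}.dfs.core.windows.net": account_key
    },
)

display(dbutils.fs.mounts())        # list all current mount points
dbutils.fs.refreshMounts()          # refresh mount metadata on the cluster
# dbutils.fs.unmount(f"/mnt/{container}")  # clean up when no longer needed
```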
• Azure Key vault backed scopes
• Databricks-backed scopes
• Pass Parameter to Notebook from ADF Pipeline
• Send Exception or Error message from Pipeline to Notebook
• Send parameters from Notebook to ADF Pipelines
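A sketch of the ADF ↔ notebook parameter flow above: ADF base parameters arrive as widgets, and a value is returned to the pipeline via `dbutils.notebook.exit()`. Databricks-only; the parameter name is a placeholder.

```python
# Notebook invoked by an ADF "Databricks Notebook" activity.
# ADF base parameters surface as widgets in the notebook.
dbutils.widgets.text("run_date", "")        # parameter supplied by ADF
run_date = dbutils.widgets.get("run_date")

# ... transformation logic would go here ...

# Return a value to the pipeline; ADF reads it from
# activity('NotebookActivity').output.runOutput
dbutils.notebook.exit(f"processed:{run_date}")
```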
• Time Travel/Versioning in Delta Lake
• RESTORE
• Get Delta Lake History
• Vacuum Command in Delta Lake
• Merge Command in Delta Lake
• Schema evolution in Delta Lake
• Change Data feed in Delta Lake
• Z-Order index
• Table constraints
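The Delta Lake topics above can be sketched as a handful of SQL commands run from a notebook. This requires a Delta-enabled cluster; the table name `events` and Z-Order column are placeholders.

```python
# Delta Lake maintenance and time travel (Databricks/Delta only).
spark.sql("DESCRIBE HISTORY events")                   # version/audit log
spark.sql("SELECT * FROM events VERSION AS OF 3")      # time travel by version
spark.sql("RESTORE TABLE events TO VERSION AS OF 3")   # roll back the table
spark.sql("VACUUM events RETAIN 168 HOURS")            # purge stale files (7 days)
spark.sql("OPTIMIZE events ZORDER BY (user_id)")       # co-locate data for pruning
```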
• Runbooks
• Run SQL Commands from Runbook
• Run Pipeline from Runbook
• Optimize performance with caching on Azure Databricks
• Dynamic file pruning
• Low shuffle merge
• Adaptive query execution
• Predictive I/O
• Cost-based optimizer
• Auto optimize
• Higher-order functions
• Isolation levels
• Column mapping
• Parallelize
• collect()
• Repartition vs coalesce
• Broadcast variables
• Accumulator
• RDD Transformations
• Transformations Actions
• Create an Empty DataFrame
• Create Empty DataFrame with Schema
• Convert Empty RDD to DataFrame
• show() / display()
• StructType() and StructField()
• Column class
• select()
• withColumn()
• withColumnRenamed()
• where() and filter()
• drop() and dropDuplicates()
• orderBy() and sort()
• groupBy()
• join()
• union() and unionAll()
• unionByName()
• map()
• flatMap()
• fillna() and fill()
• pivot()
• partitionBy()
• MapType()
• foreach()
• User Defined Functions
• Aggregate functions
• Window functions
• Date and Timestamp functions
• JSON functions
• Read and write CSV File
• Read and Write Parquet File
• Read and Write JSON File
• Read Hive Table
• Save to Hive Table
• Read JDBC in parallel
• Query Database Table
• Read and Write SQL Server
• Read JDBC Table
• when()
• expr()
• lit()
• split()
• concat_ws()
• substring()
• translate()
• regexp_replace()
• to_timestamp()
• to_date()
• date_format()
• struct()
• countDistinct()
• sum(), avg()
• row_number()
• rank()
• dense_rank()
• from_json()
• to_json()
• json_tuple()
• get_json_object()
• schema_of_json()
• array()
• collect_list()
• collect_set()
• create_map()
• map_keys()
• map_values()
• months_between()
• explode()
• array_contains()
• datediff()
• Autoloader → Ingest data efficiently into Delta tables.
• Unity Catalog → Govern and secure those Delta tables across workspaces/clouds.
• Mounting → Provide simple, persistent access to external storage locations.
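A sketch of the Auto Loader ingestion pattern mentioned above: stream new files from cloud storage into a Delta table. Databricks-only; the paths, checkpoint location, and table name are placeholders.

```python
# Auto Loader: incrementally ingest files into Delta (Databricks only).
stream = (
    spark.readStream.format("cloudFiles")
         .option("cloudFiles.format", "json")
         .option("cloudFiles.schemaLocation", "/mnt/meta/schema")
         .load("/mnt/raw/events")
)

(stream.writeStream
       .format("delta")
       .option("checkpointLocation", "/mnt/meta/checkpoints/events")
       .trigger(availableNow=True)     # process pending files, then stop
       .toTable("bronze.events"))
```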
• CDC (Change Data Capture) in Databricks
• CDC Approaches in Databricks
• Implementing CDC in Databricks with Delta Lake
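One common way to implement CDC with Delta Lake is to apply a change feed to a target table with `MERGE`. A hedged sketch (Databricks/Delta only; the table names and the `op` column convention are assumptions):

```python
# Apply captured changes to a Delta target with MERGE (Delta Lake only).
from delta.tables import DeltaTable

target = DeltaTable.forName(spark, "silver.customers")
changes = spark.table("bronze.customer_changes")   # assumed CDC feed

(target.alias("t")
       .merge(changes.alias("c"), "t.id = c.id")
       .whenMatchedDelete(condition="c.op = 'DELETE'")   # propagate deletes
       .whenMatchedUpdateAll(condition="c.op = 'UPDATE'")
       .whenNotMatchedInsertAll()                        # new rows
       .execute())
```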
• Medallion Architecture in Databricks
• What is it?
• Layers in Medallion Architecture
• Key Benefits
UPI, Net Banking, Debit/Credit Cards, and EMI options are available.
Generally, recordings are accessible for 3 to 6 months post-course completion.