The editors at Solutions Review have compiled this list of the best Databricks training and courses to consider for 2020.
Databricks is one of the most widely used advanced analytics platforms in the world. Databricks offers a unified analytics platform that allows users to prepare and clean data at scale and continuously train and deploy machine learning models for AI applications. The product handles all analytic deployments, ranging from ETL to model training and deployment. It is also available as a fully managed service on Microsoft Azure and Amazon Web Services.
With this in mind, we’ve compiled this list of the best Databricks training and courses to consider if you’re looking to grow your data analytics skills for work or career advancement. This is not an exhaustive list, but one that features the best Databricks training from trusted online platforms. We made sure to mention and link to related courses on each platform that may be worth exploring as well.
Description: Databricks offers its own training and course modules, as well as a free Learn From Home program through Databricks Academy. A number of training options are available, including private corporate training, public training, certifications, and self-paced training. Whichever you choose, the instruction is hands-on and uses the actual product. Databricks even has plans to release a mastery-level program in the near future.
Description: In this course, you will use the Community Edition of Databricks to explore the platform, understand the difference between interactive and job clusters, and run jobs by attaching applications as JAR files along with libraries. This course was designed for data engineers who have working knowledge of Apache Spark using Scala, Python, or Spark SQL, data scientists with working knowledge of Apache Spark, and IT leaders who want to get started with Apache Spark in the cloud.
Related path/track: Databricks Fundamentals & Apache Spark Core
Platform: Coursera (UC Davis)
Description: This course is for students who have SQL experience and now want to take the next step in gaining familiarity with distributed computing using Spark. Students will gain an understanding of when to use Spark and how Spark as an engine uniquely combines Data and AI technologies at scale. The four modules build on one another, and by the end of the course the student will understand Spark architecture, Spark DataFrames, optimizing the reading and writing of data, and how to build a machine learning model.
Description: In this course, Building Your First ETL Pipeline Using Azure Databricks, you will gain the ability to use the Spark-based Databricks platform running on Microsoft Azure and leverage its features to quickly build and orchestrate an end-to-end ETL pipeline. Along the way, you will learn about the collaboration options and optimizations the platform brings, without having to worry about infrastructure management.
Platform: LinkedIn Learning
Description: In this course, Lynn Langit digs into patterns, tools, and best practices that can help developers and DevOps specialists use Azure Databricks to efficiently build big data solutions on Apache Spark. Lynn covers how to set up clusters and use Azure Databricks notebooks, jobs, and services to implement big data workloads. She also explores data pipelines with Azure Databricks—including how to use ML Pipelines—as well as architectural patterns for machine learning.
Platform: Cloud Academy
Description: In this course, Cloud Academy will start by showing you how to set up a Databricks workspace and a cluster. Next, they’ll go through the basics of how to use a notebook to run interactive queries on a dataset. Then you’ll see how to run a Spark job on a schedule. After that, they’ll show you how to train a machine learning model. Finally, they’ll go through several ways to deploy a trained model as a prediction service.