This course is an entry point for developers who need to build applications that analyze Big Data stored in Apache Hadoop using Spark. Topics include an overview of the Hortonworks Data Platform (HDP), including HDFS and YARN; using Spark Core APIs for interactive data exploration; Spark SQL and DataFrame operations; Spark Streaming and DStream operations; data visualization, reporting, and collaboration; performance monitoring and tuning; building and deploying Spark applications; and an introduction to the Spark Machine Learning Library.

We can organize this training at your preferred date and location. Contact Us!

Prerequisites

Students should be familiar with programming principles and have previous experience in software development using either Python or Scala. Previous experience with data streaming, SQL, and HDP is also helpful, but not required.

Who Should Attend

Software engineers who want to develop in-memory applications for time-sensitive and highly iterative workloads in an enterprise HDP environment.

What You Will Learn

Upon completing the course, students will be able to:

Describe Hadoop, HDFS, YARN, and the HDP ecosystem
Describe Spark use cases
Explore and manipulate data using Zeppelin
Explore and manipulate data using a Spark REPL
Explain the purpose and function of RDDs
Employ functional programming practices
Perform Spark transformations and actions
Work with Pair RDDs
Perform Spark queries using Spark SQL and DataFrames
Use Spark Streaming stateless and window transformations
Visualize data, generate reports, and collaborate using Zeppelin
Monitor Spark applications using Spark History Server
Apply general application optimization guidelines and tips
Use data caching to increase performance of applications
Build and package Spark applications
Deploy applications to the cluster using YARN
Understand the purpose of Spark MLlib

Training Outline

Use common HDFS commands
Use a REPL to program in Spark
Use Zeppelin to program in Spark
Perform RDD transformations and actions
Perform Pair RDD transformations and actions
Utilize Spark SQL
Perform stateless transformations using Spark Streaming
Perform window-based transformations
Use Zeppelin for data visualization and reporting
Monitor applications using Spark History Server
Cache and persist data
Configure checkpointing, broadcast variables, and executors
Build and submit a Spark application to YARN
Run Spark MLlib applications

Available Training Dates

Join our public courses at our Istanbul, London, and Ankara facilities. Private classes can be organized at the location of your choice, according to your schedule.

25 November 2025 (4 Days)
Istanbul, Ankara, London
Classroom / Virtual Classroom

27 November 2025 (4 Days)
Istanbul, Ankara, London
Classroom / Virtual Classroom

04 January 2026 (4 Days)
Istanbul, Ankara, London
Classroom / Virtual Classroom

02 February 2026 (4 Days)
Istanbul, Ankara, London
Classroom / Virtual Classroom

28 February 2026 (4 Days)
Istanbul, Ankara, London
Classroom / Virtual Classroom

20 March 2026 (4 Days)
Istanbul, Ankara, London
Classroom / Virtual Classroom

14 September 2026 (4 Days)
Istanbul, Ankara, London
Classroom / Virtual Classroom

HDP Developer: Enterprise Apache Spark I Training

AN OVERVIEW OF THE HORTONWORKS DATA PLATFORM (HDP)