Hadoop Apache Spark & Scala Developer

Free

COURSE DESCRIPTION

"Scala Training with Apache Spark" introduces Apache Spark, a fast, general-purpose cluster computing system. Spark provides high-level APIs in Java, Scala, and Python, and an optimized engine that supports general execution graphs.

It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.

Scala combines object-oriented and functional programming in one concise, high-level language. Scala’s static types help avoid bugs in complex applications, and its JVM and JavaScript runtimes let you build high-performance systems with easy access to huge ecosystems of libraries.

TARGET AUDIENCE

  • Fresh graduates
  • Data engineers
  • Software developers
  • ETL developers

Scala Course Content

Module 1: Introduction to Scala

  • Introducing Scala
  • Deployment of Scala for Big Data applications
  • Apache Spark analytics.

Module 2: Pattern Matching

  • The importance of Scala
  • The concept of the REPL (Read-Evaluate-Print Loop)
  • Deep dive into Scala pattern matching
  • Type inference
  • Higher-order functions
  • Currying
  • Traits
  • Application space
  • Scala for data analysis.
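
As a small illustration of three topics from this module (higher-order functions, currying, and type inference), here is a minimal sketch. The names `FunctionsDemo`, `applyTwice`, `add`, and `addFive` are made up for this example and are not taken from the course material.

```scala
object FunctionsDemo {
  // Higher-order function: it takes another function as a parameter.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Curried method: the arguments are supplied one parameter list at a time.
  def add(a: Int)(b: Int): Int = a + b

  // Supplying only the first argument list yields a function Int => Int.
  val addFive: Int => Int = add(5)

  // Type inference: the compiler works out List[Int] without an annotation.
  val doubled = List(1, 2, 3).map(_ * 2)

  def main(args: Array[String]): Unit = {
    println(applyTwice(addFive, 1)) // prints 11
    println(doubled)                // prints List(2, 4, 6)
  }
}
```

The same definitions can be pasted into the Scala REPL one at a time to try them interactively.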

Module 3: Executing Scala code

  • Scala Interpreter
  • Static object timer in Scala
  • Testing String equality in Scala
  • Implicit classes in Scala
  • The concept of currying in Scala
  • Various classes in Scala.
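
Two of the topics above, string equality and implicit classes, can be sketched in a few lines. This is only an illustration: `EqualityDemo`, `RichWord`, and `shout` are hypothetical names invented for the example.

```scala
object EqualityDemo {
  val a: String = "spark"
  val b: String = new String("spark") // a distinct object with the same characters

  // In Scala, == compares values (it delegates to equals) ...
  val sameValue: Boolean = a == b
  // ... while eq compares object references.
  val sameReference: Boolean = a eq b

  // An implicit class adds a method to an existing type without modifying it.
  implicit class RichWord(word: String) {
    def shout: String = word.toUpperCase + "!"
  }

  val shouted: String = "scala".shout

  def main(args: Array[String]): Unit = {
    println(sameValue)     // prints true
    println(sameReference) // prints false
    println(shouted)       // prints SCALA!
  }
}
```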

Module 4: Classes in Scala

  • The concept of classes
  • Understanding constructor overloading
  • Abstract classes
  • The type hierarchy in Scala
  • The concept of object equality
  • The val and var keywords in Scala.

Module 5: Case classes and pattern matching

  • Understanding sealed traits
  • Wildcard pattern
  • Constructor pattern
  • Tuple pattern
  • Variable pattern
  • Constant pattern.
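
The pattern kinds listed above can be seen side by side in one short sketch. The `Shape` hierarchy and the `describe` and `area` functions are hypothetical, chosen only to show each pattern once.

```scala
object PatternDemo {
  // A sealed trait can only be extended in this file, so the compiler
  // can check that a match over Shape covers every case.
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rect(width: Double, height: Double) extends Shape

  def area(shape: Shape): Double = shape match {
    case Circle(r)  => math.Pi * r * r // constructor pattern
    case Rect(w, h) => w * h           // constructor pattern
  }

  def describe(value: Any): String = value match {
    case 0          => "zero"                    // constant pattern
    case (a, b)     => s"pair of $a and $b"      // tuple pattern
    case Rect(w, _) => s"rectangle of width $w"  // constructor pattern, wildcard for the height
    case other      => s"something else: $other" // variable pattern, binds any remaining value
  }

  def main(args: Array[String]): Unit = {
    println(area(Rect(2.0, 3.0)))     // prints 6.0
    println(describe((1, "a")))       // prints pair of 1 and a
    println(describe(Rect(2.0, 3.0))) // prints rectangle of width 2.0
    println(describe("hi"))           // prints something else: hi
  }
}
```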

Module 6: Concepts of traits with an example

  • Traits in Scala
  • The advantages of traits
  • Linearization of traits
  • The Java equivalent
  • Avoiding boilerplate code.
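
Linearization is easiest to see when two traits override the same method and call `super`. The following is a minimal sketch with made-up names (`Greeter`, `Loud`, `Welcoming`); it shows that the order in which traits are mixed in changes the result.

```scala
object TraitDemo {
  trait Greeter {
    def greet(name: String): String = s"Hello, $name"
  }

  // Each trait wraps whatever comes before it in the linearization order.
  trait Loud extends Greeter {
    override def greet(name: String): String = super.greet(name).toUpperCase
  }

  trait Welcoming extends Greeter {
    override def greet(name: String): String = super.greet(name) + " and welcome"
  }

  // The rightmost trait is applied last: Welcoming wraps Loud here ...
  class LoudThenWelcoming extends Loud with Welcoming
  // ... and Loud wraps Welcoming here.
  class WelcomingThenLoud extends Welcoming with Loud

  def main(args: Array[String]): Unit = {
    println(new LoudThenWelcoming().greet("Ann")) // prints HELLO, ANN and welcome
    println(new WelcomingThenLoud().greet("Ann")) // prints HELLO, ANN AND WELCOME
  }
}
```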

Module 7: Scala-Java interoperability

  • Implementation of traits in Scala and Java
  • Extending multiple traits.

Module 8: Scala collections

  • Introduction to Scala collections
  • Classification of collections
  • Difference between Iterator and Iterable in Scala
  • Example of a list sequence in Scala.
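
The difference between `Iterable` and `Iterator` mentioned above can be sketched with a small list. `CollectionsDemo` and `drain` are illustrative names, not part of the course material.

```scala
object CollectionsDemo {
  // List is an immutable sequence and an Iterable.
  val numbers: List[Int] = List(1, 2, 3)

  // Prepending with :: builds a new list; the original is left unchanged.
  val extended: List[Int] = 0 :: numbers

  // An Iterable can be traversed as many times as needed.
  val firstSum: Int = numbers.sum
  val secondSum: Int = numbers.sum

  // An Iterator is a one-pass cursor: once consumed, it has no elements left.
  def drain(): (Int, Boolean) = {
    val it = numbers.iterator
    val total = it.sum // consumes every element
    (total, it.hasNext)
  }

  def main(args: Array[String]): Unit = {
    println(extended)              // prints List(0, 1, 2, 3)
    println(firstSum == secondSum) // prints true
    println(drain())               // prints (6,false)
  }
}
```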

Module 9: Mutable collections vs. immutable collections

Module 10: Use case: the bobsrockets package

  • Introduction to Scala packages and imports
  • The selective imports
  • The Scala test classes
  • Introduction to JUnit test class
  • JUnit interface via JUnit 3 suite for Scala test
  • Packaging Scala applications in a directory structure
  • Example of Spark Split and Spark Scala.

Module 11: Spark framework comparing Scala

  • Detailed Apache Spark
  • Various features
  • Comparing with Hadoop
  • Various Spark components
  • Combining HDFS with Spark
  • Scalding

Module 12: RDD in Spark using Scala

  • RDD operations in Spark
  • Spark transformations, actions, and data loading
  • Comparing with MapReduce
  • Key Value Pair.

Module 13: Spark SQL and Data Frames

  • Spark SQL in detail
  • The significance of SQL in Spark for structured data processing
  • Spark SQL JSON support
  • Working with XML data and Parquet files
  • Creating a HiveContext
  • Writing a Data Frame to Hive
  • The importance of Data Frames in Spark
  • Creating Data Frames
  • Manual schema inference
  • Working with CSV files
  • Reading JDBC tables
  • Converting a Data Frame to JDBC
  • User-defined functions in Spark SQL
  • Shared variables and accumulators
  • How to query and transform data in Data Frames
  • How Data Frames provide the benefits of both Spark RDD and Spark SQL
  • Deploying Hive on Spark as the execution engine.

Module 14: Spark Streaming using Scala

  • Introduction to Spark streaming
  • The architecture of Spark Streaming
  • Working with the Spark streaming program
  • Processing data using Spark streaming
  • Request count and DStream
  • Multi-batch and sliding window operations
  • Working with advanced data sources.

Module 15: Scala training outcome

  • Participants will be able to perform big data analytics using Spark and Scala.

Course Features

  • Lectures 0
  • Quizzes 0
  • Students 1
  • Assessments Yes