Semester -1
Semester 1 Project work:
SYLLABI
SEMESTER I
2. Raptor 15 Hrs
Program design and development process
Problem definition
Pseudo-code
Flowcharting
Code modularization
Coding, testing, and debugging
Sequence, selection, and iteration patterns
Array processing
File processing
Values and Variables
Integer Values
Variables and Assignment
Identifiers
Additional Integer Types
Floating-point Types
Constants
Other Numeric Types
Characters
Enumerated Types
Expressions and Arithmetic
Expressions
Mixed Type Expressions
Operator Precedence and Associativity
Comments
Compile-time Errors
Run-time Errors
Logic Errors
Compiler Warnings
Arithmetic Examples
Integer Implementation
Floating-point Implementation
Bitwise Operators
Algorithms
Conditional Execution
Type bool
Boolean Expressions
The Simple if Statement
Compound Statements
The if/else Statement
Compound Boolean Expressions
Nested Conditionals
Multi-way if/else Statements
Iteration
The while Statement
Nested Loops
Abnormal Loop Termination
The break statement
The goto Statement
The continue Statement
Infinite Loops
Iteration Examples
Drawing a Tree
Printing Prime Numbers
Using Functions
Introduction to Using Functions
Standard Math Functions
Maximum and Minimum
clock Function
Character Functions
Random Numbers
Arrays
Static Arrays
Pointers and Arrays
Dynamic Arrays
Copying an Array
Multidimensional Arrays
Command-line Arguments
Vectors vs. Arrays
Prime Generation with a Vector
Custom Objects
Object Basics
Instance Variables
Member Functions
Constructors
Defining a New Numeric Type
Encapsulation
Handling Exceptions
Motivation
Exception Examples
Custom Exceptions
Catching Multiple Exceptions
Exception Mechanics
Using Exceptions
Scope
Access Modifiers
The static keyword
The final keyword
Java Collections
Binary Search
Collections List Methods
Comparable and Comparator
Maps
Immutable Classes
Sets & HashSet
Sorted Collections
TreeMap and Unmodifiable Maps
Project Euler (Intermediate – 20)
Books & References
Text books:
1. Java for Programmers, Deitel and Deitel, Prentice Hall, 2016
Reference Books:
1 : Linux Basics
Introduction
Linux and the Operating System
Graphical Environments and Interfaces
Getting Help
Text Editors
Shells, bash, and the Command Line
System Components
System Administration
Essential Command Line Tools
Command and Tool Details
Users and Groups
Bash Scripting
Files and Filesystems
Linux Intermediate
Filesystem Layout
Linux Filesystems
Compiling, Linking and Libraries
Java Installation and Environment
Python and dependency installation
Building RPM and Debian Packages
Introduction to GIT
Git Installation
Git and Revision Control Systems
Using Git: an Example
Git Concepts and Architecture
Managing Files and the Index
Commits
Branches
Diffs
Merges
Managing Local and Remote Repositories
Using Patches
3 : Basic Python
Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
Database Input (Connecting to database)
Viewing Data objects - subsetting, methods
Exporting Data to various formats
Important python modules: Pandas, beautifulsoup
Cleansing Data with Python
Data manipulation steps (sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, data type conversions, renaming, formatting, etc.)
Data manipulation tools (operators, functions, packages, control structures, loops, arrays, etc.)
Python Built-in Functions (Text, numeric, date, utility functions)
Python User Defined Functions
Stripping out extraneous information
Normalizing data
Formatting data
Important Python modules for data manipulation (Pandas, Numpy, re, math, string, datetime etc)
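As an illustration of the cleansing steps listed in this unit (stripping out extraneous information, normalizing, and formatting), the following is a minimal sketch that uses only the standard-library modules re and datetime named above. The sample record, its field names, and the helper clean_record are hypothetical and are not part of the syllabus.

```python
import re
from datetime import datetime


def clean_record(record):
    """Return a cleaned copy of a raw record (hypothetical field names)."""
    # Strip extraneous information: collapse runs of whitespace and trim.
    name = re.sub(r"\s+", " ", record["name"]).strip()
    # Normalize: use one consistent letter case for names.
    name = name.title()
    # Strip everything except digits and the decimal point from the amount.
    amount = float(re.sub(r"[^0-9.]", "", record["amount"]))
    # Format: parse a day/month/year date and emit it in ISO 8601 form.
    joined = datetime.strptime(record["joined"].strip(), "%d/%m/%Y").date().isoformat()
    return {"name": name, "amount": amount, "joined": joined}


raw = {"name": "  aNNa   smith ", "amount": "$1,250.50", "joined": " 05/03/2021 "}
cleaned = clean_record(raw)
print(cleaned)
```

With Pandas, the same steps would typically be applied column by column rather than record by record.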
Important packages for exploratory analysis (NumPy arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)
What is HPE IDOL
HPE IDOL use cases
Understanding the complexity of the infrastructure software by examining the technology from a high-level perspective
Understand the architecture of HPE IDOL
Different components used
Understanding of license server and its validity
HPE IDOL Server configuration
Different types of connectors supported and their uses, based on their respective ports.
3. Social media
Unit Work: Using user accounts created on social media websites such as Facebook and Twitter, work with the Facebook and Twitter social media connectors. An assignment covers setting up a LinkedIn social media connector.
Unit IX: Retrieval using IDOL Find, end user search Interface
Advanced search using functions, and indexing parametric data
Parametric parameters and their uses
4. Architecture
Understanding the prerequisites of hardware and software
Understanding various configurations and services of Hadoop.
Understand the difference between a regular file system and the Hadoop Distributed File System (HDFS).
Concept of MapReduce
Different roles of the user
Working with the JobTracker and TaskTracker
Flow of MapReduce
Different concepts of MapReduce.
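To make the flow of MapReduce listed above concrete, here is a small word-count sketch of the map, shuffle, and reduce phases. It runs in a single Python process and does not use Hadoop; the function names and sample input are illustrative assumptions, not Hadoop APIs.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


lines = ["big data big ideas", "data flows"]
mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle_phase(mapped)
counts = dict(reduce_phase(key, values) for key, values in grouped.items())
print(counts)
```

In a real cluster the map and reduce calls run on different nodes, and the framework performs the shuffle between them.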
5. Advanced concepts
1. Introduction to HBase
o The problem with distributed computing
o Installing HBase
o The role of HBase in the Hadoop ecosystem
o How is HBase different from RDBMS?
o HBase Data Model
o Introducing CRUD operations
o How is HBase different from Hive?
CRUD operations using the HBase Shell
Advanced Big Data – 1 (SPARK – Machine Learning) XX Hours
1 : Introduction to Spark
4: Spark graph
GraphFrames Property Graph analysis
Project – Social network analysis
5. Deploying in Production
Spark deployment models
High availability and fault tolerance
Monitoring streaming jobs
Summary
Reference books:
Learning Real-time Processing with Spark Streaming, Sumit Gupta, Packt, September 2015
1: Introduction
Machine learning: goals, results, supervised/unsupervised – Spark as a tool for Big Data –
Python as the language of Spark – Spark structures for machine learning – sample use cases –
life cycle of machine learning – data preparation for machine learning – binning – outlier
treatment – missing values – feature selection
2: Linear methods
Logistic Regression – Odds – Log odds – linear relations – assumptions in logistic regression –
L1/L2 regularization – Use case: healthcare prediction
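The relationship between odds, log odds, and the linear relation named above can be sketched as follows. This is a minimal illustration using only the math module; the probability 0.8 and the weight and bias arguments are hypothetical values chosen for the example.

```python
import math


def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1.0 - p)


def log_odds(p):
    """Log odds (logit) of probability p."""
    return math.log(odds(p))


def sigmoid(z):
    """Inverse of the logit: maps a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, weight, bias):
    """Logistic regression models the log odds as a linear function of x."""
    return sigmoid(weight * x + bias)


p = 0.8
print(odds(p))               # approximately 4
print(log_odds(p))           # approximately 1.386
print(sigmoid(log_odds(p)))  # recovers approximately 0.8
```

L1/L2 regularization would add a penalty on the size of the weight when fitting; it is not shown here.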
SVM (Support Vector Machines) – risk boundaries – hyperparameters – Vapnik space – SVM
classifier – SVM regression – Linear SVM – non-linear SVM – single-class SVM – Use case:
anomaly detection
Decision Trees – selection of variable – impurity measures – entropy – Gini – chi-square –
splitting variables – Use case: Diabetes diagnosis
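Two of the impurity measures named above, entropy and Gini, can be computed from class labels as in this minimal sketch. The helper names and the sample labels are hypothetical; library implementations (for example in Spark) differ in detail.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of the samples that fall in each class."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity: 1 - sum(p^2) over the classes."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


mixed = ["yes", "yes", "no", "no"]
pure = ["yes", "yes", "yes", "yes"]
print(entropy(mixed), gini(mixed))
print(entropy(pure), gini(pure))
```

An evenly mixed node has the highest impurity and a node containing a single class has none, which is why a tree prefers splits that produce purer child nodes.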
Random forests – diversity metrics – voting – bagging – boosting – random trees - Use case:
Credit scoring
4: Unsupervised methods
Clustering (K-Means) – distance metrics – Euclidean – City block – power distance – similarity
metrics – dissimilarity metrics – number of clusters – elbow criterion – cluster validity –
applications of clustering – Use case: topic grouping
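The distance metrics listed above can be sketched in a few lines. This example assumes that "power distance" means the generalization with separate exponents p and r, of which Euclidean (p = r = 2) and city block (p = r = 1) are special cases; that reading, along with the function names and sample points, is an assumption rather than something the syllabus states.

```python
def power_distance(a, b, p, r):
    """Power distance: (sum of |a_i - b_i|^p) raised to the power 1/r."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / r)


def euclidean(a, b):
    """Euclidean distance: power distance with p = r = 2."""
    return power_distance(a, b, 2, 2)


def city_block(a, b):
    """City block (Manhattan) distance: power distance with p = r = 1."""
    return power_distance(a, b, 1, 1)


def nearest_centroid(point, centroids, distance=euclidean):
    """K-Means assignment step: index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))


print(euclidean((0, 0), (3, 4)))
print(city_block((0, 0), (3, 4)))
print(nearest_centroid((1, 1), [(0, 0), (5, 5)]))
```

K-Means alternates this assignment step with recomputing each centroid as the mean of its assigned points.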
Principal Component Analysis (PCA) – dimension reduction – covariance matrix – linear
combinations – new variables – assumptions in PCA – non-linear PCA – Use case: stock analysis