Syllabus

MA589 Statistical Foundations for Data Science

Course Code: MA589 Course Name: Statistical Foundations for Data Science Credits: 3-0-0-6
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus:

Probability spaces, conditional probability, independence; Random variables, distribution functions, probability mass and density functions, functions of random variables, standard univariate discrete and continuous distributions; Mathematical expectations, moments, moment generating functions, inequalities. Random vectors, joint, marginal and conditional distributions, conditional expectations, independence, covariance, correlation, standard multivariate distributions, functions of random vectors; Law of large numbers, central limit theorem. Sampling distributions; Point estimation – estimators, minimum variance unbiased estimation, maximum likelihood estimation, method of moments, consistency; Interval estimation; Testing of hypotheses – tests and critical regions, likelihood ratio tests; Linear regression.

Textbooks:
  • B. L. S. Prakasa Rao, A First Course in Probability and Statistics, World Scientific / Cambridge University Press India, 2009.
  • R. V. Hogg, J. W. McKean, and A. T. Craig, Introduction to Mathematical Statistics, 6th Edition, Pearson Education India, 2006.
References:
  • Additional readings and instructor-provided material during the course.

MA579H Scientific Computing

Course Code: MA579H Course Name: Scientific Computing Credits: 3-0-0-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus:

Definition and sources of errors; Solutions of nonlinear equations – Bisection method, Newton's method and its variants, fixed point iterations, convergence analysis; Newton's method for nonlinear systems. Finite differences and polynomial interpolation; Numerical integration – Trapezoidal and Simpson's rules, Gaussian quadrature. Initial value problems – Taylor series method, Euler and modified Euler methods, Runge-Kutta methods.

Textbooks:
  • W. Cheney and D. Kincaid, Numerical Mathematics and Computing, 7th Edition, Cengage, 2013.
  • K. E. Atkinson, An Introduction to Numerical Analysis, 2nd Edition, John Wiley, 1989.
References:
  • Additional lecture notes and material as provided by the instructor.

MA580H Matrix Computations

Course Code: MA580H Course Name: Matrix Computations Credits: 3-0-0-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus:

Linear systems – All variants of Gaussian elimination and LU factorization, Cholesky factorization. Linear least-squares problems – Normal equations, rotators and reflectors, QR factorization via rotators, reflectors and Gram-Schmidt orthonormalisation; QR method for linear least-squares problems; Rank-deficient least-squares problems. Singular Value Decomposition (SVD) – Numerical rank determination via SVD, solution of least-squares problems, Moore-Penrose inverse, low-rank approximations using SVD, Principal Component Analysis, applications to data mining and image recognition. Eigenvalue Decomposition – Power, inverse power and Rayleigh quotient iterations, Schur decomposition, unitary similarity transformations of Hermitian matrices to tridiagonal form, QR algorithm, implementation of the explicit QR algorithm for Hermitian matrices.

Textbooks:
  • L. N. Trefethen and David Bau, Numerical Linear Algebra, SIAM, Philadelphia, 1997.
  • D. S. Watkins, Fundamentals of Matrix Computations, 2nd Edition, Wiley, 2002.
  • L. Eldén, Matrix Methods in Data Mining and Pattern Recognition, SIAM, Philadelphia, 2007.
References:
  • Additional lecture notes and research articles as recommended by the instructor.

DA511H Data Structures and Algorithms

Course Code: DA511H Course Name: Data Structures and Algorithms Credits: 3-0-0-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus:

Review of fundamental data structures. Models of computation: Random access machines, space and time complexity measures, lower and upper bounds. Algorithm design techniques: Greedy method, divide-and-conquer, dynamic programming, and backtracking. Sorting and searching algorithms. Graph algorithms including traversal, shortest path, and spanning trees. Hashing techniques: Separate chaining, linear probing, and quadratic probing. Search trees: Binary search trees, AVL trees, and B-trees.

Textbooks:
  • T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 3rd Edition, MIT Press, 2009.
  • Jon Kleinberg and Éva Tardos, Algorithm Design, 1st Edition, Pearson Education, 2005.
References:
  • Supplementary lecture notes and instructor-recommended research papers.

DA512H Databases

Course Code: DA512H Course Name: Databases Credits: 3-0-0-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus:

Data Models: Overview of data models with emphasis on the relational model. Database Design: Conceptual design using the Entity-Relationship (E-R) model; mapping E-R models to relational schemas. Relational Algebra and Calculus: Formal query languages for relational databases. SQL: SQL queries, constraints, and triggers. Application Development: Stored procedures and database programming concepts.

Textbooks:
  • R. Ramakrishnan and J. Gehrke, Database Management Systems, 3rd Edition, McGraw Hill, 2003.
References:
  • Instructor-provided lecture notes and database project guides.

DA513 Data Structures and Databases Lab

Course Code: DA513 Course Name: Data Structures and Databases Lab Credits: 0-0-3-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus: Programming assignments are based on the theory courses DA511H Data Structures and Algorithms and DA512H Databases.
Textbooks:
  • T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 3rd Edition, MIT Press, 2009.
  • Jon Kleinberg and Éva Tardos, Algorithm Design, 1st Edition, Pearson Education, 2005.
  • R. Ramakrishnan and J. Gehrke, Database Management Systems, 3rd Edition, McGraw Hill, 2003.
References:
  • Additional materials and assignments provided by the instructor.

DA514 Python Programming Lab

Course Code: DA514 Course Name: Python Programming Lab Credits: 0-0-3-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus: Fundamental concepts: Literals, variables and identifiers, operators, expressions and data types; Control structures: Boolean expressions, selection control, iterative control; Lists: List structures, lists (sequences), iterating over lists; Functions: Program routines, calling value-returning functions, calling non-value-returning functions, parameter passing, variable scope; Dictionaries and Sets; Recursion; Text Files: Using text files, string processing, exception handling.
Textbooks:
  • Charles Dierbach, Introduction to Computer Science Using Python: A Computational Problem Solving Focus, John Wiley & Sons, 2012.
References:
  • Additional reading and programming exercises provided by the instructor.

MA581 Numerical Computations Lab

Course Code: MA581 Course Name: Numerical Computations Lab Credits: 0-0-3-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Odd Semester
Syllabus: Programming assignments are based on the theory courses MA579H Scientific Computing and MA580H Matrix Computations.
Textbooks:
  • L. N. Trefethen and David Bau, Numerical Linear Algebra, SIAM, 1997.
  • D. S. Watkins, Fundamentals of Matrix Computations, 2nd Edition, Wiley, 2002.
  • D. Kincaid and W. Cheney, Numerical Analysis: Mathematics of Scientific Computing, 3rd Edition, AMS, 2002.
  • K. E. Atkinson, An Introduction to Numerical Analysis, 2nd Edition, John Wiley, 1989.
References:
  • Additional programming materials as suggested during the course.

EE595H Stochastic Models

Course Code: EE595H Course Name: Stochastic Models Credits: 3-0-0-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Even Semester
Syllabus: Stochastic Processes: Definition and classification of random processes; Discrete-time Markov chains; Poisson process; Continuous-time Markov chains; Bayesian statistics; Monte Carlo; Gibbs Sampler: data augmentation, burn-in, convergence; Metropolis-Hastings algorithm: independent sampler, random walk Metropolis, scaling, multi-modality; Approximate Bayesian Computation.
Textbooks:
  • Sheldon M. Ross, Stochastic Processes, Wiley, 1995.
  • W. R. Gilks, S. Richardson, and D. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman and Hall, 1996.
References:
  • Additional readings suggested during lectures.

EE596H Optimization Techniques

Course Code: EE596H Course Name: Optimization Techniques Credits: 3-0-0-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Even Semester
Syllabus: Optimization – sequences and limits, derivative matrix, level sets and gradients, Taylor series; unconstrained optimization – necessary and sufficient conditions for optima, convex sets, convex functions, optima of convex functions, steepest descent, Newton and quasi-Newton methods, conjugate direction methods; constrained optimization – linear and non-linear constraints, equality and inequality constraints, optimality conditions, constrained convex optimization, projected gradient methods, penalty methods.
Textbooks:
  • E. K. P. Chong and S. H. Zak, An Introduction to Optimization, 2nd Edition, Wiley India Pvt. Ltd., 2010.
  • D. G. Luenberger and Y. Ye, Linear and Nonlinear Programming, 3rd Edition, Springer, 2010.
References:
  • Additional readings will be suggested during the course.

EE526 Machine Learning

Course Code: EE526 Course Name: Machine Learning Credits: 3-0-0-6
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Even Semester
Syllabus: Introduction to learning; Bayesian Classification; Feature Selection; PCA; K-Means Clustering; DBSCAN; Hierarchical Agglomerative Clustering; GMM; Mean-shift Clustering; Multilayer Perceptron; RBF Networks; Classification Performance Analysis; Decision Trees; SVM; Introduction to Multiple Kernel Learning; Ensemble Methods – Bagging and Boosting; Hidden Markov Models; Introduction to CNN and RNN; Introduction to Reinforcement Learning.
Textbooks:
  • E. Alpaydin, Introduction to Machine Learning, 3rd Edition, Prentice Hall (India), 2015.
  • R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 2nd Edition, Wiley India, 2007.
  • C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
  • S. O. Haykin, Neural Networks and Learning Machines, 3rd Edition, Pearson Education (India), 2016.
  • J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, 2004.
  • I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2017.
  • R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
References:
  • Relevant research papers in the area of Machine Learning.

EE527 Machine Learning Lab

Course Code: EE527 Course Name: Machine Learning Lab Credits: 0-0-3-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Even Semester
Syllabus: Design of experiments in Machine Learning; Introduction to popular Machine Learning Datasets and Toolkits; Face Recognition using PCA; Practical applications of clustering; Experiments on supervised classification using MLP, RBF ANN, SVM and Decision Trees; Application of Classifier Ensembles; Sequence classification using HMM; Applications of CNN and RNN; Path planning with Reinforcement Learning.
Textbooks:
  • To be provided during the course.
References:
  • Additional materials and readings will be suggested during the course.

MA588 R Programming Lab

Course Code: MA588 Course Name: R Programming Lab Credits: 0-0-3-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Even Semester
Syllabus: Introduction to R: basic commands, graphics, indexing data, loading data; Regression: linear regression, test of significance, residual analysis, polynomial regression, qualitative predictors, logistic regression; Resampling methods: cross-validation, bootstrap; Subset selection: best subset selection, forward and backward stepwise selection, choosing among models using validation; Markov chain Monte Carlo; Optimization in R: common R packages for linear, quadratic and non-linear optimization, built-in optimization functions, linear programming with lpSolve, quadratic programming with quadprog, non-linear optimization in one dimension (golden section search) and in several dimensions (gradient-based, Hessian-based and non-gradient-based methods).
Textbooks:
  • G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: with Applications in R, Springer, 2013.
  • W. John Braun and Duncan J. Murdoch, A First Course in Statistical Programming with R, Cambridge University Press, 2008.
  • R Optimization Packages and Resources
References:
  • Additional readings and materials will be suggested during the course.

DA531 Data Visualization Lab

Course Code: DA531 Course Name: Data Visualization Lab Credits: 0-0-3-3
Pre-requisite: None Offered to: M.Tech. (Data Science) Offered in: Even Semester
Syllabus: Defining data visualization; Visualization workflow: describing the data visualization workflow, the process in practice; Data representation: chart types (categorical, hierarchical, relational, temporal and spatial); 2-D: bar charts, clustered bar charts, dot plots, connected dot plots, pictograms, proportional shape charts, bubble charts, radar charts, polar charts, range charts, box-and-whisker plots, univariate scatter plots, histograms, word clouds, pie charts, waffle charts, stacked bar charts, back-to-back bar charts, treemaps and other relevant 2-D charts; 3-D: surfaces, contours, hidden surfaces, pm3d coloring, 3D mapping; multi-dimensional data visualization; manifold visualization; graph data visualization; annotation.
Textbooks:
  • Andy Kirk, Data Visualisation: A Handbook for Data Driven Design, Sage Publications, 2016.
  • Philipp K. Janert, Gnuplot in Action: Understanding Data with Graphs, Manning Publications, 2010.
References:
  • Additional readings and materials will be suggested during the course.

Elective Pool

The following elective courses are offered by the Departments of Electronics and Electrical Engineering (EEE) and Mathematics, and by the Mehta Family School of Data Science and Artificial Intelligence (MFSDSAI). They are open to M.Tech. (Data Science) students, subject to departmental approval and seat availability. The electives actually offered vary from semester to semester; the respective departments share the final list during the course registration period. Detailed syllabi for these courses, along with additional offerings, are available on the official websites of the respective departments.

Course Code, Course Name, Credits (grouped by department)

Department: Electronics and Electrical Engineering
EE624 Image Processing 3-0-0-6
EE625 Computer Vision 3-0-0-6
EE626 Biomedical Signal Processing 3-0-0-6
EE627 Speech Signal Processing and Coding 3-0-0-6
EE657 Pattern Recognition For Machine Learning 3-0-0-6
EE660 Biometrics 3-0-0-6
EE664 Introduction to Parallel Computing 3-0-0-6
EE692 Detection and Estimation Theory 3-0-0-6

Department: Mathematics
MA504 Combinatorial Optimization 3-0-0-6
MA544 Wavelets and Applications 3-0-0-6
MA562 Mathematical Modeling and Numerical Simulation 3-0-0-6
MA576 Large Scale Scientific Computation 3-0-0-6
MA577 Perturbation Methods 3-0-0-6
MA593 Statistical Methods and Time Series Analysis 3-0-0-6
MA601 Graphs and Matrices 3-0-0-6
MA681 Applied Stochastic Processes 3-0-0-6
MA682 Statistical Inference 3-0-0-6
MA691 Advanced Statistical Algorithms 3-0-0-6

Department: MFSDSAI
DA547 Introduction to Mathematical Biology 3-0-0-6
DA621 Deep Learning for Computer Vision 3-0-0-6
DA622 Robustness and Interpretability in Machine Learning 3-0-0-6
DA623 Computing with Signals 3-0-0-6
DA624 NLP with Large Language Models 3-0-0-6
DA625 Special Topics in Natural Language Processing 3-0-0-6
DA626 Recommendation System Design 3-0-0-6
DA527 Neuromorphic Artificial Intelligence 3-0-0-6
DA546 Introduction to Statistical Learning 3-0-0-6
DA526 Image Processing with Machine Learning 3-0-0-6
DA651 Artificial Intelligence for Next-Generation Wireless Systems 3-0-0-6
DA652 Information and Inference 3-0-0-6
DA641 Non-linear Regression 3-0-0-6
DA672 Data-driven System Theory 3-0-0-6
DA642 Time-series Analysis 3-0-0-6
DA671 Introduction to Reinforcement Learning 3-0-0-6
DA674 Advanced Topics in Reinforcement Learning 3-0-0-6
DA673 Neural Data Analysis 3-0-0-6
DA675 Fuzzy Systems and Applications 3-0-0-6
DA771 Machine Learning in Quantum Physics 3-0-0-6