
BSc (Hons) Data Science and AI Course Syllabus
DA101 Basic English
Course Code: DA101 | Course Name: Basic English | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Listening: What is listening, difference between listening and speaking, barriers to listening, effective listening strategies, comprehending social conversation, comprehending narrations and academic lectures; Speaking: Understanding accent (intelligibility, Indian and non-Indian accents), nuances of fluency; understanding effective speaking strategies, using language in various situations such as introducing oneself and others in formal and informal situations, asking for and giving information, describing people, places and objects, narrating events, explaining processes and products, expressing opinions, arguing, giving instructions, participating in conversations and group discussions, understanding turn-taking strategies, making short presentations; Reading: Reading simple narratives and comprehending the gist, identifying topic sentences, identifying cohesive devices and their functions, comprehending texts of different genres; Vocabulary: understanding different aspects of a word, learning strategies to develop vocabulary, using dictionaries; Grammar: articles, quantifiers, punctuation, tenses, gerunds, infinitives, participles, subject-verb agreement, adverbs, nouns, pronouns, prepositions, connectives, adjectives, common errors; Writing: writing paragraphs, narratives, summarizing, paraphrasing, note-taking, note-making, reviews, and short reports. | ||
Textbooks:
|
||
References:
|
DA102 Data Analysis Basics
Course Code: DA102 | Course Name: Data Analysis Basics | Credits: 2-0-4-8 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Different forms of data (structured, unstructured, temporal and spatial) and their representation. Data files and their formats: CSV, XLSX, text files, coded files, etc. Cleansing of data (missing values, duplicates, null values, data not in proper format, misplaced delimiters, embedded space characters or nonprinting characters, removal of unnecessary spaces), data wrangling. Statistical analysis of data: tabulation, measures of central tendency (mean, median, mode), measures of dispersion and variance (range, variation, standard deviation), measures of skewness, time-series analysis using spreadsheet tools such as Excel. Data visualization for decision making: organizing and summarizing data using charts and graphs, PivotTables and PivotCharts. Curve fitting and regression. | ||
Textbooks:
|
||
References:
|
DA103 Introduction to Statistics
Course Code: DA103 | Course Name: Introduction to Statistics | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Basics of probability and random variables, distribution functions, probability mass and density functions, functions of random variables, standard univariate discrete and continuous distributions; Mathematical expectations, moments, moment generating functions, inequalities; Two dimensional random variables, joint, marginal and conditional distributions, conditional expectation, independence, covariance, correlation; Law of large numbers, Central limit theorem. | ||
Textbooks:
|
||
References:
|
DA104 C Programming
Course Code: DA104 | Course Name: C Programming | Credits: 2-0-4-8 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Basic structure of a C program, executing a C program; data types, operators and expressions: C tokens, keywords and identifiers, variables and constants, data types and sizes, declaration of variables and assigning values, symbolic constants, arithmetic, relational and logical operators, type conversions, increment and decrement operators, bitwise operators, assignment operators and expressions, conditional expressions, precedence and order of evaluation; Branching and looping: if statement, if-else statement, nesting of if-else statements, switch statement, loops – while, for and do-while, break and continue, goto and labels; Functions and Program Structure: basics of C functions, return values and their types, external variables, header files, recursion, the C preprocessor, pointers and arrays, address arithmetic, command-line arguments, pointers to functions, basics of structures, pointers to structures; Input and Output: standard input and output, formatted input and output (scanf and printf), file access, error handling. | ||
Textbooks:
|
||
References:
|
DA105 Linear Algebra
Course Code: DA105 | Course Name: Linear Algebra | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Systems of linear equations, matrices; Solving systems of linear equations: Gaussian elimination, echelon form, column space, null space, rank of a matrix, inverse, determinant and their properties, Cramer’s rule; Vector spaces (over the field of real and complex numbers), subspaces, spanning set, linear independence, basis and dimension; Linear transformations, rank-nullity theorem, matrix of a linear transformation, change of basis and similarity; Eigenvalues and eigenvectors, algebraic and geometric multiplicity, similarity and diagonalization; Inner-product spaces, Gram-Schmidt process, orthonormal basis; Orthogonal, Hermitian and symmetric matrices, positive definite matrices, QR factorization, singular value decomposition; Introduction to matrix calculus. | ||
Textbooks:
|
||
References:
|
DA106 Data Science: An Introduction
Course Code: DA106 | Course Name: Data Science: An Introduction | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Overview of Data Science, Data Science task workflow; Data collection and management: Nature of data sets; Multimodal data, structured and unstructured data; Data collection and curation; Concepts in data management: XML and JSON file formats, basics of SQL; Data cleaning, exploration, plots: Reading and exporting data, cleaning data; Exploratory data analysis: Missing values, Outlier detection, Data Transformation. Model building, training, evaluation: Machine Learning Task Workflow, train/validation/test set preparation, Linear regression and Bayes Classifier; Model training and performance analysis; Examples of Data Science applications. | ||
Textbooks:
|
||
References:
|
DA107 Computer System Tools
Course Code: DA107 | Course Name: Computer System Tools | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Basic computer system architecture: Input, Output, Storage, Arithmetic Logic Unit, Control Unit; basics of CPU and GPU; Operating systems: different types and major functions; boot process; file system and partitions; OS installation; Editors; Shell and shell programming; Device drivers; Tools for computer system management under Windows and Unix/Linux environments: Resource monitoring, Task/process management, System configuration, Storage management, Security and Network settings; Managing user accounts; Software package management. | ||
Textbooks:
|
||
References:
|
DA108 Python Programming
Course Code: DA108 | Course Name: Python Programming | Credits: 2-0-4-8 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Fundamental concepts: Variables and identifiers, data types, literals, operators, expressions; Conditional statements; Loops; Data structures: Lists, dictionaries and sets; Functions: Procedural and Recursive; Classes; Exception handling; File handling. | ||
Textbooks:
|
||
References:
|
DA109 AI Basics
Course Code: DA109 | Course Name: AI Basics | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Introduction to AI and Intelligent Agents; Problem solving by Searching: Uninformed and informed strategies; Logical Agents: Propositional and first order logic, inference; Knowledge representation and Automated Planning; Uncertain Knowledge and Reasoning: Quantifying uncertainty, probabilistic reasoning; Introduction to Learning: Supervised Learning, Unsupervised Learning, Reinforcement Learning; Applications and Case Studies. | ||
Textbooks:
|
||
References:
|
DA110 Data Structures
Course Code: DA110 | Course Name: Data Structures | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Abstract data types, arrays, stacks, queues, linked lists, binary trees, tree traversals, heaps; Sorting – merge-sort, quicksort, heapsort; Searching - linear search, binary search, binary search trees, AVL trees, red-black trees, B-trees; Graph data structure and representations, breadth first search, depth first search; Hashing. | ||
Textbooks:
|
||
References:
|
DA111 Algorithm Design & Analysis
Course Code: DA111 | Course Name: Algorithm Design & Analysis | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Asymptotic notation, space and time complexity; Sorting and order statistics - linear time sorting, quicksort; Searching; Design and analysis techniques - greedy method, divide-and-conquer, dynamic programming, amortized analysis; Graph algorithms - properties of BFS and DFS, connected components, topological sort, minimum spanning trees, shortest paths, max flow. | ||
Textbooks:
|
||
References:
|
DA112 Introduction to R
Course Code: DA112 | Course Name: Introduction to R | Credits: 2-0-4-8 |
---|---|---|
Pre-requisite: None | ||
Syllabus: Introduction to R: Why R?, Installation of R, RStudio, Cloud Computing, RNW files; Basic Operations in R, R as a calculator; Working with data types and variables; Vector and Matrix operations in R; Relational and Logical Operators; Missing Data Handling; Conditional Statements – if and if-else, nested if, else if, and ifelse, switch and which commands; Loops – for loop, while loop, repeat loop; Functions in R; Sequences, Sorting, Ordering and Mode; Lists and Operations on Lists; Vector Indexing; Factors – Class and Unclass; Strings – Display and Formatting: print and format function, concatenate; Data Frames: Creation and Operations, Combining and Merging; Data Handling: Importing and Reading CSV, Excel data files; Saving and Writing Data Files; Organizing and commenting R code; Data Plotting and Visualization: Scatter plots, bar plots, subdivided bar plots, pie diagrams, histograms; Bivariate and three-dimensional scatter plots; Introduction to basic statistical functions and packages. | ||
Textbooks:
|
||
References:
|
DA201 Relational Database Management Systems
Course Code: DA201 | Course Name: Relational Database Management Systems | Credits: 3-0-3-9 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Relational DBMS: Entity-relationship (ER) model, relational algebra; SQL queries, constraints, triggers; SQL and front-end tools; Storage and file structure: Overview of secondary storage, RAID and flash storage, indexing (tree, hash, and bitmap), implementation of relational operators; Transaction management: ACID properties, concurrency control, crash recovery. Introduction to Databases: Concepts, importance, and types; Relational Data Model: Fundamentals, data integrity, normalization; Entity-Relationship (ER) Model; Storage and File structure; Structured Query Language (SQL); Database Design; Database Administration; Transaction management; Database Connectivity and Application Development; NoSQL Databases. Practical Component: Assignments based on theory. |
||
Textbooks:
|
||
References:
|
DA202 Java Programming
Course Code: DA202 | Course Name: Java Programming | Credits: 2-0-4-8 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Introduction: Overview of Java programming language, Setting up Java Development Environment, Basic syntax and structure of Java programs; Data Types and Variables; Type conversion and casting; Control Flow Statements; Looping statements; Methods and Functions, Object-Oriented Programming (OOP) Concepts; Exception handling; Collections framework; File handling; Data Structures: Arrays and multidimensional arrays, Linked lists, stacks, and queues, Trees and graphs; GUI Programming; Database Connectivity; Multithreading; Networking; Data Science and AI Libraries. |
||
Textbooks:
|
||
References:
|
DA203 Optimization
Course Code: DA203 | Course Name: Optimization | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Introduction: Optimization problems and existence of optimal solutions, convex sets and convex functions; Unconstrained optimization: Basic properties of solutions and algorithms, gradient method, Newton’s method, quasi-Newton method; Linear optimization: Simplex algorithm, duality; Constrained optimization: Equality and inequality constraints, projected gradient method; Convex optimization. |
||
Textbooks:
|
||
References:
|
DA204 Basic Econometrics
Course Code: DA204 | Course Name: Basic Econometrics | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Brief review of random variables: expectation, variance, covariance, estimation and inference in the context of economics; Classical Linear Regression Model: least squares estimation, unbiasedness and efficiency, Gauss-Markov theorem, hypothesis testing, goodness of fit; Nonlinear models, Dummy variables; Heteroscedasticity, Autocorrelation and Multicollinearity: detection, implications and possible remedies; Omitted variables, measurement errors and instrumental variables; Binary response model, Sample selection problem. |
||
Textbooks:
|
||
References:
|
DA205 Data Mining and Warehousing
Course Code: DA205 | Course Name: Data Mining and Warehousing | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Introduction: Definitions, Review of basics of data analysis, applications; Data Warehousing: Definition, architectures, dimensional modeling and star schema, ETL (extract, transform, load) processes; Finding Similar Items: Similarity measures and distance metrics, Shingling and locality-sensitive hashing; Frequent Pattern Mining: Itemset, substring, sequence, pattern evaluation and interestingness measures; Graph Mining: Graph data and representations, link analysis, pattern mining, graph clustering techniques; Mining Data Streams: characteristics of data streams, sliding window models, approximate and sketching techniques, change detection and concept drift. |
||
Textbooks:
|
||
References:
|
DA206 Statistical Inferencing
Course Code: DA206 | Course Name: Statistical Inferencing | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Principles of point estimation; Properties of estimators: unbiasedness, consistency, sufficiency, mean squared error; Methods of estimation: least squares estimation, method of moments estimators, maximum likelihood estimators (MLEs), statistical properties of MLEs, Fisher information, Cramér-Rao lower bound; Confidence intervals, Bootstrap percentile method; Testing of hypotheses: Binary hypothesis testing, Type-I and Type-II errors, power function, likelihood ratio tests, Neyman-Pearson lemma; Significance testing: general approach, generalized likelihood ratio tests; Bayesian vs classical statistics, Bayesian inference and posterior distribution, Maximum a Posteriori probability rule, Bayesian least mean squares estimation. |
||
Textbooks:
|
||
References:
|
DA207 Signals and Systems
Course Code: DA207 | Course Name: Signals and Systems | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Introduction to Signals: continuous-time, discrete-time signals, properties of signals and various signal operations; Signal representation: signal space and orthogonal bases, Fourier series representation of continuous-time signals, Fourier transform (FT) and its properties; Sampling theorem: aliasing, signal reconstruction, ideal interpolation; Discrete-time FT, Discrete Fourier Transform (DFT) and its properties; System classification and properties; Linear Time Invariant (LTI) systems; Time-domain representation of LTI systems: Impulse response, convolution and response to arbitrary input; Frequency response of LTI systems. |
||
Textbooks:
|
||
References:
|
DA208 Social Media Tools and Techniques
Course Code: DA208 | Course Name: Social Media Tools and Techniques | Credits: 3-0-2-8 |
---|---|---|
Pre-requisite: None | ||
Syllabus:
Fundamentals: Understanding social media content, frameworks, characteristics, challenges; Mathematical foundations; Data Crawling: Techniques, policies, ethics, responsibilities; Different APIs for crawling data; Text content analysis: Tokenization, lemmatization; Heaps' law, Zipf's law; retrieval models, relevance, ranking; Text embedding; Feature selection, text classification; Network analytics: fundamentals, centrality, link prediction, community detection; network embedding; Applications: Real-world applications of social media data. |
||
Textbooks:
|
||
References:
|
DA209 Data Modeling and Visualization
Course Code: DA209 | Course Name: Data Modeling and Visualization | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | Offered in: Trimester-III, Second Year | Type: Compulsory |
Syllabus:
Understanding Structure of Data: Entities, attributes, relationships, data types, data models basics and types, dataset creation and best practices, data preparation for visualization; Introduction to Visual Perception; Guidelines for data visualization; Coordinate systems and color scales; Visualizing Amounts; Visualizing Distributions; Visualizing Proportions; Visualizing Associations Among Variables; Visualizing Time Series and Trends; Visualizing geospatial data and Uncertainty: Projections; Principles of Figure Design; Advanced Topics: Image file formats, visualization software, and storytelling using data. |
||
Textbooks:
|
||
References:
|
DA210 Time Series Analysis and Forecasting
Course Code: DA210 | Course Name: Time Series Analysis and Forecasting | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | Offered in: Trimester-III, Second Year | Type: Compulsory |
Syllabus:
Introduction to time series data: Practical examples, Trend in time series data, Parametric trend, LS estimation, Differencing, Nonparametric methods, Trend and noise, Simple time series models, Analysis objectives; Stationary processes: basic properties, linear processes, ARMA(1,1) process, properties, forecasting stationary time series; ARMA(p,q) models: processes, ACF and PACF, spectral densities, periodogram, time-invariant linear filters, spectral density of an ARMA(p,q) process; Modeling and Forecasting with ARMA(p,q) processes: Yule-Walker estimation, Burg’s algorithm, Innovations algorithm, Maximum Likelihood Estimation, Diagnostic checking, Forecasting, Order selection criterion; Nonstationary and seasonal models; Applications and recent developments. |
||
Textbooks:
|
||
References:
|
DA261 Machine Learning Fundamentals
Course Code: DA261 | Course Name: Machine Learning Fundamentals | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | Offered in: Trimester-III, Second Year | Type: Compulsory |
Syllabus:
Introduction: Supervised and unsupervised learning, Generative and discriminative models, Concept of dimensionality and feature vectors, Multi-dimensional Gaussian distribution, Mean vector and covariance matrix in Gaussian distribution; Supervised Learning: Bayesian classification principles, Computation of decision surfaces, Error calculation and performance measures, Risk minimization strategies, Zero-one loss function, Maximum Likelihood Estimation and Maximum A Posteriori Estimation, Bayesian learning concepts, Parzen windows and k-nearest neighbor algorithm, Distance measures and Dynamic Time Warping, Decision trees for classification tasks; Unsupervised Learning: K-means clustering, Hierarchical Agglomerative Clustering, Gaussian Mixture Models, Density-Based Spatial Clustering of Applications with Noise; Dimensionality Reduction Techniques: Curse of dimensionality, Applications of Principal Component Analysis and Fisher Discriminant Analysis to classification problems. |
||
Textbooks:
|
||
References:
|
DA262 Recommender Systems
Course Code: DA262 | Course Name: Recommender Systems | Credits: 3-0-0-6 |
---|---|---|
Pre-requisite: None | Offered in: Trimester-III, Second Year | Type: Compulsory |
Syllabus:
Introduction to Recommender Systems; Traditional Recommendation Techniques: Nearest neighbor-based, association rule-based, content-based filtering, collaborative filtering; Matrix Factorization Techniques: Introduction, SVD review, alternating least squares, non-negative matrix factorization; Advanced Recommendation Techniques: Context-aware, hybrid, model-based methods; Evaluation metrics and methodologies; Recommender System Challenges: Cold start problem, data sparsity, scalability, privacy, and explainability; Case Studies and Applications: E-commerce, social media, multimedia, and other domains; Ethical considerations in recommender systems. |
||
Textbooks:
|
||
References:
|