Research

Advancing the frontiers of Computer Vision and Deep Learning

Research Focus

My research focuses on developing robust and efficient deep learning models for computer vision applications. I am particularly interested in self-supervised learning approaches that learn meaningful representations from unlabeled data, multimodal learning that combines visual and textual information, and adversarial robustness for building more reliable AI systems.

Working under the guidance of Prof. Arijit Sur and Dr. Pinaki Mitra at the MultiMedia Lab, IIT Guwahati, I explore novel approaches to address fundamental challenges in machine learning and computer vision.

Research Areas

Computer Vision
Developing algorithms for image analysis, object detection, visual understanding, and scene interpretation. Focus on robust and efficient vision systems that can work in real-world scenarios.
Object Detection, Image Analysis, Visual Understanding, Scene Recognition
Deep Learning
Advancing neural network architectures, optimization techniques, and representation learning methods. Developing efficient and robust deep learning models for various applications.
Neural Networks, Representation Learning, Model Optimization, Architecture Design
Multimodal Learning
Developing models that can understand and process information from multiple modalities such as vision and language. Focus on cross-modal understanding and joint representation learning.
Vision-Language, Cross-modal Learning, Multimodal Fusion, Joint Embeddings
Self-Supervised Learning
Exploring methods to learn meaningful representations without labeled data. Investigating contrastive and predictive approaches for learning from unlabeled datasets.
Contrastive Learning, Unsupervised Learning, Representation Learning, Pretext Tasks
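The contrastive approach mentioned above is often instantiated with an InfoNCE-style objective: two augmented views of the same image are pulled together in embedding space while other images in the batch are pushed apart. Below is a minimal NumPy sketch of that objective; the function name, batch layout, and temperature value are illustrative, not taken from any specific paper.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE contrastive loss between two batches of embeddings.

    z1, z2: (N, D) arrays where z1[i] and z2[i] are embeddings of two
    augmented views of the same image (the positive pair); all other
    rows act as negatives. Returns a scalar loss (illustrative sketch).
    """
    # L2-normalize so similarities are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    # (N, N) similarity matrix; positives sit on the diagonal
    logits = (z1 @ z2.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Usage: matched views should score a much lower loss than random pairings
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_matched = info_nce_loss(z, z)
loss_random = info_nce_loss(z, rng.normal(size=(8, 16)))
```

The temperature controls how sharply the softmax concentrates on the hardest negatives; small values (around 0.1) are a common choice in contrastive setups.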

Current Research Projects

C-LEAD: Contrastive Learning for Enhanced Adversarial Defense
Preprint (Under Review)
This work introduces a novel contrastive learning framework for enhancing adversarial robustness in deep neural networks. The approach leverages contrastive learning principles to learn robust feature representations that are less susceptible to adversarial perturbations while maintaining competitive performance on clean data.

Key Contributions:

  • Novel contrastive learning framework for adversarial defense
  • Theoretical analysis of robustness properties
  • Comprehensive evaluation on multiple benchmark datasets
  • Improved trade-off between clean accuracy and robustness
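To make the threat model concrete: the adversarial perturbations that such a defense must withstand are typically generated by perturbing an input along the sign of the loss gradient (one-step FGSM). The sketch below illustrates that attack on a simple logistic classifier with an analytic gradient; it is a generic illustration of the attack setting, not the C-LEAD method itself, and all names here are hypothetical.

```python
import numpy as np

def bce_loss(x, y, w, b):
    """Binary cross-entropy of a logistic classifier p = sigmoid(w.x + b)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm_perturb(x, y, w, b, eps=0.1):
    """One-step FGSM attack: shift x by eps along the sign of dLoss/dx.

    For logistic regression the input gradient has the closed form
    (p - y) * w, so no autodiff framework is needed in this sketch.
    """
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Usage: the perturbed input should incur a strictly higher loss
rng = np.random.default_rng(1)
w = rng.normal(size=4)
b = 0.0
x = rng.normal(size=4)
y = 1.0
x_adv = fgsm_perturb(x, y, w, b, eps=0.2)
```

A defense is then judged by how little the loss (and accuracy) degrades under such perturbations while clean-data performance is preserved.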
Multi-source Transfer Learning with Self-Supervised Learning
In Progress
Investigating novel approaches to combine multiple source domains for transfer learning using self-supervised learning techniques. The goal is to develop methods that can effectively leverage diverse source domains to improve performance on target domains with limited labeled data.

Research Objectives:

  • Develop multi-source domain adaptation algorithms
  • Integrate self-supervised learning for better representations
  • Handle domain shift and distribution mismatch
  • Evaluate on computer vision benchmarks
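One simple way to leverage diverse source domains, as described above, is to weight each source by how similar its feature distribution is to the target's. The sketch below scores each source by the cosine similarity between domain-mean feature vectors and turns the scores into softmax weights; this is a hypothetical illustration of the idea, not the method under development.

```python
import numpy as np

def source_weights(source_feats, target_feats):
    """Softmax weights over source domains (hypothetical weighting scheme).

    source_feats: list of (N_i, D) feature arrays, one per source domain.
    target_feats: (M, D) feature array from the target domain.
    Each source is scored by the cosine similarity of its mean feature
    vector to the target's mean feature vector.
    """
    t = target_feats.mean(axis=0)
    t = t / (np.linalg.norm(t) + 1e-8)
    sims = []
    for s in source_feats:
        m = s.mean(axis=0)
        sims.append((m / (np.linalg.norm(m) + 1e-8)) @ t)
    sims = np.array(sims)
    w = np.exp(sims - sims.max())  # stable softmax over domains
    return w / w.sum()

# Usage: a source distributed like the target should get the larger weight
rng = np.random.default_rng(2)
target = rng.normal(loc=1.0, size=(32, 8))
near_source = rng.normal(loc=1.0, size=(32, 8))
far_source = rng.normal(loc=-1.0, size=(32, 8))
w = source_weights([near_source, far_source], target)
```

In a full pipeline these weights would scale each source's loss term during training, so dissimilar domains contribute less and domain shift is dampened.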

Research Methodology

Problem Identification

Systematic literature review and gap analysis to identify challenging problems in computer vision and machine learning.

Algorithm Development

Design and implementation of novel algorithms with theoretical foundations and practical considerations.

Experimental Validation

Comprehensive experiments on standard benchmarks with rigorous statistical analysis and comparisons.

Publication & Sharing

Dissemination of research findings through peer-reviewed publications and open-source implementations.

Research Tools & Technologies

Deep Learning Frameworks

PyTorch, TensorFlow, Keras, JAX

Programming Languages

Python, C/C++, MATLAB, R

Computer Vision Libraries

OpenCV, PIL, scikit-image, Albumentations

Data Analysis & Visualization

NumPy, Pandas, Matplotlib, Seaborn, Plotly

High-Performance Computing

CUDA, Docker, Slurm, Git

Experiment Management

Weights & Biases, MLflow, TensorBoard, Hydra

Future Research Directions

Emerging Research Areas

Multimodal Deep Learning

Advancing multimodal learning techniques that effectively combine and understand information from multiple modalities including vision, language, and audio.

Agentic AI

Developing autonomous AI agents capable of reasoning, planning, and taking actions in complex environments with minimal human intervention.

RAG (Retrieval-Augmented Generation)

Exploring retrieval-augmented generation systems that combine knowledge retrieval with generative models for more accurate and contextual responses.

VQA (Visual Question Answering)

Advancing visual question answering systems that can understand and reason about visual content to provide accurate answers to natural language questions.

Zero-shot Learning

Developing models that can recognize and understand new concepts without explicit training examples, leveraging semantic knowledge and transfer learning.

Discuss Research Collaboration
