Advancing the frontiers of Computer Vision and Deep Learning
My research focuses on developing robust and efficient deep learning models for computer vision applications. I am particularly interested in self-supervised learning, which learns meaningful representations from unlabeled data; multimodal learning, which combines visual and textual information; and adversarial robustness, which makes AI systems more reliable.
Working under the guidance of Prof. Arijit Sur and Dr. Pinaki Mitra at the MultiMedia Lab, IIT Guwahati, I explore novel approaches to address fundamental challenges in machine learning and computer vision.
Advancing multimodal learning techniques that combine and interpret information across vision, language, and audio.
Developing autonomous AI agents capable of reasoning, planning, and taking actions in complex environments with minimal human intervention.
Exploring retrieval-augmented generation systems that combine knowledge retrieval with generative models to produce more accurate, context-grounded responses.
Advancing visual question answering systems that reason about visual content to answer natural language questions accurately.
Developing models that recognize concepts for which they have seen no labeled training examples, by leveraging semantic knowledge and transfer learning.