Advancing the frontiers of Computer Vision and Deep Learning
My research focuses on developing robust and efficient deep learning models for computer vision applications. I am particularly interested in three areas: self-supervised learning, which learns meaningful representations from unlabeled data; multimodal learning, which combines visual and textual information; and adversarial robustness, which aims at more reliable AI systems.
Working under the guidance of Prof. Arijit Sur and Dr. Pinaki Mitra at the MultiMedia Lab, IIT Guwahati, I explore novel approaches to address fundamental challenges in machine learning and computer vision.
Systematic literature review and gap analysis to identify challenging problems in computer vision and machine learning
Design and implementation of novel algorithms with theoretical foundations and practical considerations
Comprehensive experiments on standard benchmarks, with rigorous statistical analysis and comparison against existing methods
Dissemination of research findings through peer-reviewed publications and open-source implementations
Advancing multimodal learning techniques that combine and jointly interpret information from vision, language, and audio.
Developing autonomous AI agents capable of reasoning, planning, and taking actions in complex environments with minimal human intervention.
Exploring retrieval-augmented generation systems that combine knowledge retrieval with generative models for more accurate and contextual responses.
Advancing visual question answering systems that can understand and reason about visual content to provide accurate answers to natural language questions.
Developing models that can recognize and understand new concepts without explicit training examples of those concepts, leveraging semantic knowledge and transfer learning.