Senior Research Fellow • IIT Guwahati

Rahul Goswami

GATE Statistics • AIR 63

IIT-JAM • AIR 306

Top Rated Plus • Upwork

NPTEL Top 1% • Operations Research

Specializing in machine learning techniques, with a focus on statistical reasoning about black-box algorithms. Advancing research in statistical machine learning, ensemble techniques, and survival analysis.

PhD Candidate • ML Researcher • Statistical Analysis • Survival Analysis

Academic Affiliations

IIT Guwahati

PhD Student

Sorbonne Abu Dhabi

SAFIR Affiliate

Institute of Actuaries, India

Student Member

About Me

Senior Research Fellow at IIT Guwahati with expertise in Machine Learning and Statistical Analysis

Research Focus

Statistical reasoning about Black Box Algorithms

Statistical Machine Learning and Ensemble Techniques

Bayesian Inference and Anomaly Detection

Survival Analysis and Default Transaction Detection

Recent Publications

Published
Shape Penalized Decision Forests

IEEE • Imbalanced Data Classification

Accepted
MART: Moving Average Randomized Tree

Springer Machine Learning

Current Pursuits

MITx MicroMasters in Finance

Complementing technical expertise with financial knowledge

Progress: 2 of 3 courses complete

StatML Blog

Explore my latest insights on Statistical Machine Learning

statml.in

Personal Motto

"Pasos Cortos, Vista Larga"

Short Steps, Long View

Education

Academic journey and achievements

2024

MITx MicroMasters Program

MicroMasters Program in Finance

2024 - Current

In Progress
  • Cleared two courses: Foundations of Modern Finance I & II
  • Currently pursuing the Financial Reporting course

2021

Indian Institute of Technology, Guwahati

Senior Research Fellow • Department of Mathematics

2021 - Current

GPA: 8.76/10

Coursework (42 credits):

Probability Theory • Real Analysis • Linear Algebra • Advanced Topics in ML • Advanced Statistical Algorithm

2018

Banaras Hindu University

M.Sc. Statistics and Computing

2018 - 2020

GPA: 8.4/10

Center for Interdisciplinary Mathematical Sciences (84 credits)

Key courses: Bayesian statistics, computational statistics, regression, probability, and survival analysis

2015

Deen Dayal Upadhyaya Gorakhpur University

B.Sc. Statistics and Computer Science

2015 - 2018

First Division

Completed CS1 Certification Course by the Institute of Actuaries India

Professional Experience

Research, industry collaboration, and technical expertise

Math Expert

ByteDance

March 2025 - Current

Quality reviewer of PhD-level math questions

AI Trainer

Turing

August 2024 - February 2025

  • Helping top companies build AGI
  • PhD-level math expert

Research Intern

Adobe

May 2024 - August 2024

  • Research on memorization, reasoning, and counting capabilities of LLMs
  • Developed quantitative metrics for LLM performance evaluation
  • Guided redesign and optimization of Adobe's in-house LLMs

NLP Expert

AI2 (Freelance)

March 2024 - May 2024

  • Natural language processing on historical research papers
  • Applied sophisticated NLP methods for information extraction
  • Made historical research more accessible

Research Collaborator

Huntington's Disease Project

December 2022 - Present

  • Collaboration with Iowa University researchers
  • Applied Linear Mixed-Effect models for data analysis
  • Contributing to Huntington's Disease research

Machine Learning Engineer

Ready Tensor

March 2021 - December 2021

  • Implemented advanced ML models: TSMixer, PatchMixer
  • Developed CatBoost, AdaBoost, and XGBoost implementations
  • Focus on time-series forecasting and classification

Research & Publications

Contributions to machine learning and statistical analysis

Publications

Published

Shape Penalized Decision Forests for Imbalanced Data Classification

Classification trees often yield fragmented minority boundaries under imbalanced data. Proposed SVR regularization that penalizes decision-set complexity.

Accepted

MART: Moving Average Randomized Tree

Springer Machine Learning • A randomized CAGR-based split method for predicting future trends in the stock market.

Under Review

Concordance-based Survival Cobra with Regression Type Weak Learners

A novel survival analysis method that combines concordance-based techniques with regression-type weak learners.

Software Packages

imbalanced-spdf

Shape Penalized Decision Forests for training ensemble classifiers tailored for imbalanced datasets.

SPBoDF • SPBaDF • Imbalanced Data

cobsurv

Combined Regression Strategy Survival, a product of the PBSA (Proximity-Based Survival Analysis) research project.

Survival Analysis • PBSA

fastkme

A faster implementation of the Kaplan-Meier estimator, the standard nonparametric estimator of the survival function in survival analysis.

Kaplan-Meier • Performance

Technical Skills

Expertise across programming languages, frameworks, and methodologies

Programming Languages

Python • Expert
R • Expert
Julia • Intermediate
Lean • Intermediate

ML Frameworks

PyTorch • Expert
TensorFlow • Expert
scikit-learn • Proficient
Hugging Face • Skilled

Research Areas

Statistical ML

Ensemble Techniques

Bayesian Inference

Anomaly Detection

Survival Analysis

Transaction Detection

Black Box Algorithms

Statistical Reasoning

Tools & Technologies

Docker

Git

OpenAI GPT

LLaMA

ggplot2

dplyr

Get in Touch

Let's collaborate on research or discuss opportunities

Contact Information

Location

Guwahati, Assam, India
