
Siddhant H Mantri

AI/ML Enthusiast

siddhantmantri.com

About

I'm a tech enthusiast with a strong desire to learn and grow. I've taught myself various programming languages and technologies, and I'm always looking for opportunities to build my skills. I'm an enthusiastic, people-oriented individual who enjoys expanding my network and collaborating with like-minded professionals. I thrive on the excitement of learning and look forward to contributing my passion for technology to any team or project.

Work Experience

Jan - Jun 2025

ML Intern

Sudha Gopalakrishnan Brain Centre - IIT Madras

Trained and fine-tuned YOLOv9 on a curated brain segmentation dataset (25 target sections), achieving a Dice score of 0.97 through a novel post-inference processing technique to address YOLO limitations. Worked on developing an innovative neural network model extending YOLOv9, enabling end-to-end brain region segmentation without post-inference steps. Designed and implemented robust data preprocessing and annotation pipelines, ensuring high-quality inputs for model training and evaluation.

Sep 2024 - Present

Research Intern

CFITL - Indian Institute of Technology Bombay

Created a 1.25M query-response pair dataset from Hindi Wordnet, collaborating with linguists to design an advanced data processing pipeline. Fine-tuned Gemma3-12B (LoRA) for an educational Hindi language chatbot, leveraging prompt engineering and advanced post-processing for improved student support across proficiency levels. Developed a context-aware keyword detection service using conversation history, integrating statistical, pattern-based NLP, and Stanza-based components for enhanced contextual understanding.

Jun - Sep 2024

ML Intern

Volkswagen Group Technology Solutions India

As part of Project AISHA 2.0, I contributed to multiple stages of developing an in-house Retrieval-Augmented Generation (RAG) chatbot built from scratch. My key responsibilities included researching the indexing and retrieval process, which involved comparing various vector data stores to optimize the application. Additionally, I developed an enhancement service that significantly improved the chatbot's performance, reducing resource usage while accelerating response times. I also worked on Large Action Model development.

Education

2021 - 2025

B.Tech Computer Science and Engineering (Cyber Security)

Mukesh Patel School of Technology Management and Engineering, NMIMS, Mumbai

CGPA: 3.86/4

2021 - 2025

B.S Data Science and Applications

Indian Institute of Technology Madras, Chennai

CGPA: 7.39/10
Project CGPA: 9.0/10

Publications

Dec 2024

Architectural Framework for Automated Incident Response: Leveraging LLMs And Classifiers for Rapid Post-Attack Analysis and Reporting

ICTCS-2024 (Pub. Springer)

Siddhant Hitesh Mantri, Veer Mehta, Aryamann Khare, Dr. Pintu R Shah

Projects

Aug - Nov 2024

Automated Incident Response: Leveraging LLMs for Rapid Post-Attack Analysis and Reporting

Developed an AI-driven automated incident response framework integrating on-device Large Language Models (LLMs) and specialized classifiers to streamline post-attack analysis, reducing response times from days to hours. Architected a system for real-time log ingestion, multi-source data correlation, and comprehensive report generation, enabling rapid, accurate threat detection and response across networked environments.

Cybersecurity • Incident Response • Automation • Large Language Models (LLMs) • Machine Learning • Python • Flask • Wazuh • Suricata • SMTP • Log Analysis • Hugging Face • API Integration

Jun - Aug 2024

AI-Powered Learning Management System (LMS)

Contributed to the design and implementation of AI-driven features, including course summaries, peer-driven insights, and coding assistance, leveraging language models such as Mixtral 8x7B and Llama 3 70B. Integrated the vLLM inference engine to optimize performance, reducing latency and efficiently scaling AI-generated responses across the system. Used FastAPI to serve multiple endpoints for the AI-powered features, enhancing the overall learning experience for students.

Large Language Models • FastAPI • vLLM • Python • OpenAI

Feb - Apr 2024

Recipe for Rating: Predict Food Ratings using ML

Developed machine learning models to predict food ratings from recipe information and user reviews. Preprocessed and engineered features from a dataset containing multiple attributes. Evaluated multiple algorithms, including logistic regression, boosting techniques such as AdaBoost, and a multilayer perceptron. Achieved an accuracy of 77.166% with a logistic regression model.

Python • Machine Learning Algorithms • Pandas • NumPy • Scikit-learn

Connect

GitHub @siddhant-192
LinkedIn @siddhant-mantri