
Siddhant H Mantri
AI/ML Enthusiast
siddhantmantri.com
About
I'm a tech enthusiast with a strong desire to learn and grow. I've self-taught various programming languages and technologies, and I'm always on the lookout for opportunities to advance my abilities. I'm an enthusiastic, people-oriented individual who enjoys expanding my network and collaborating with like-minded professionals. I thrive on the excitement of learning and look forward to contributing my passion for technology to any team or project.
Work Experience
ML Intern
Sudha Gopalakrishnan Brain Centre - IIT Madras
Trained and fine-tuned YOLOv9 on a curated brain segmentation dataset (25 target sections), achieving a Dice score of 0.97 and a 30% inference speed improvement through a novel post-inference processing technique that addresses YOLO limitations. Worked on a neural network model that extends YOLOv9 to enable end-to-end brain region segmentation without post-inference steps. Designed and implemented robust data preprocessing and annotation pipelines, ensuring high-quality inputs for model training and evaluation.
Research Intern
CFITL - Indian Institute of Technology Bombay
Engineered a structured-data pipeline that transformed Hindi WordNet into a large-scale dataset of 1.25M instruction-response pairs, significantly boosting pedagogical effectiveness for conversational AI systems. Fine-tuned the Gemma3-12B model using LoRA with 4-bit quantization, achieving a 58% improvement in response consistency, sub-second inference latency, and a 75% reduction in memory usage. Built and deployed a context-aware keyword detection microservice with FastAPI, enhancing low-resource language processing by improving contextual understanding accuracy by 25%.
ML Intern
Volkswagen Group Technology Solutions India
As part of Project AISHA 2.0, I contributed to multiple stages of developing an in-house Retrieval-Augmented Generation (RAG) chatbot built from scratch. My key responsibilities included researching the indexing and retrieval process, which involved comparing various vector data stores to optimize the application. Additionally, I developed an enhancement service that significantly improved the chatbot's performance, reducing resource usage while accelerating response times. I also worked on Large Action Model development.
Education
M.S Computer Science and Engineering
University of California, San Diego (UCSD)
CGPA: -/4
B.Tech Computer Science and Engineering (Cyber Security)
Mukesh Patel School of Technology Management and Engineering, NMIMS, Mumbai
CGPA: 3.86/4
B.S Data Science and Applications
Indian Institute of Technology Madras, Chennai
CGPA: 7.39/10 Project CGPA: 9.0/10
Publications
From Lexicon to AI: A Structured-Data Pipeline for Specialized Conversational Systems in Low-Resource Languages
IJCNLP-AACL 2025 (Under Review)
Architectural Framework for Automated Incident Response: Leveraging LLMs And Classifiers for Rapid Post-Attack Analysis and Reporting
ICTCS-2024 (Pub. Springer)
Projects
Automated Incident Response: Leveraging LLMs for Rapid Post-Attack Analysis and Reporting
Developed an AI-driven automated incident response framework integrating on-device Large Language Models (LLMs) and specialized classifiers to streamline post-attack analysis, reducing response times from days to hours. Architected a system for real-time log ingestion, multi-source data correlation, and comprehensive report generation, enabling rapid, accurate threat detection and response across networked environments.
Cybersecurity • Incident Response • Automation • Large Language Models (LLMs) • Machine Learning • Python • Flask • Wazuh • Suricata • SMTP • Log Analysis • Hugging Face • API Integration
AI-Powered Learning Management System (LMS)
Contributed to the design and led the development of AI-powered features, including course summaries, peer-driven insights, and coding assistance, by leveraging advanced language models such as LLaMa3-70B. Integrated the vLLM inference engine to improve the efficiency and performance of LLM queries, reducing latency and enabling scalable AI-generated responses across the system. Developed and deployed multiple AI-driven endpoints using FastAPI, with version control through Git and agile management under the Scrum methodology. Ensured the robustness and reliability of application components by writing unit tests and debugging with pytest, ultimately enhancing the overall learning experience for students.
Large Language Models • FastAPI • vLLM • Python • OpenAI
Recipe for Rating: Predict Food Ratings using ML
Developed machine learning models to predict food ratings from recipe information and user reviews. Preprocessed and engineered features from a dataset containing multiple attributes. Evaluated multiple algorithms, including logistic regression, boosting techniques such as AdaBoost, and a multilayer perceptron. Achieved an accuracy of 77.166% with a logistic regression model.
Python • Machine Learning Algorithms • Pandas • Numpy • Scikit-learn
Connect


