
Siddhant H Mantri
AI/ML Enthusiast
siddhantmantri.com
About
I'm a tech enthusiast with a strong desire to learn and grow. I've self-taught various programming languages and technologies, and I'm always on the lookout for opportunities to advance my abilities. I'm an enthusiastic, people-oriented individual who enjoys expanding my network and collaborating with like-minded professionals. I thrive on the excitement of learning and look forward to contributing my passion for technology to any team or project.
Work Experience
ML Intern
Sudha Gopalakrishnan Brain Centre - IIT Madras
Trained and fine-tuned YOLOv9 on a curated brain segmentation dataset (25 target sections), achieving a Dice score of 0.97 through a novel post-inference processing technique to address YOLO limitations. Worked on developing an innovative neural network model extending YOLOv9, enabling end-to-end brain region segmentation without post-inference steps. Designed and implemented robust data preprocessing and annotation pipelines, ensuring high-quality inputs for model training and evaluation.
Research Intern
CFITL - Indian Institute of Technology Bombay
Created a dataset of 1.25M query-response pairs from Hindi WordNet, collaborating with linguists to design the data processing pipeline. Fine-tuned Gemma3-12B (LoRA) for an educational Hindi language chatbot, using prompt engineering and post-processing to improve student support across proficiency levels. Developed a context-aware keyword detection service that uses conversation history and combines statistical, pattern-based NLP, and Stanza-based components.
ML Intern
Volkswagen Group Technology Solutions India
As part of Project AISHA 2.0, I contributed to multiple stages of developing an in-house Retrieval-Augmented Generation (RAG) chatbot built from scratch. My key responsibilities included researching the indexing and retrieval process, which involved comparing various vector data stores to optimize the application. Additionally, I developed an enhancement service that significantly improved the chatbot's performance, reducing resource usage while accelerating response times. I also worked on Large Action Model development.
Education
B.Tech Computer Science and Engineering (Cyber Security)
Mukesh Patel School of Technology Management and Engineering, NMIMS, Mumbai
CGPA: 3.86/4
B.S Data Science and Applications
Indian Institute of Technology Madras, Chennai
CGPA: 7.39/10 • Project CGPA: 9.0/10
Publications
Architectural Framework for Automated Incident Response: Leveraging LLMs And Classifiers for Rapid Post-Attack Analysis and Reporting
ICTCS-2024 (Pub. Springer)
Projects
Automated Incident Response: Leveraging LLMs for Rapid Post-Attack Analysis and Reporting
Developed an AI-driven automated incident response framework integrating on-device Large Language Models (LLMs) and specialized classifiers to streamline post-attack analysis, reducing response times from days to hours. Architected a system for real-time log ingestion, multi-source data correlation, and comprehensive report generation, enabling rapid, accurate threat detection and response across networked environments.
Cybersecurity • Incident Response • Automation • Large Language Models (LLMs) • Machine Learning • Python • Flask • Wazuh • Suricata • SMTP • Log Analysis • Hugging Face • API Integration
AI-Powered Learning Management System (LMS)
Contributed to the design and implementation of AI-driven features, including course summaries, peer-driven insights, and coding assistance, using language models such as Mixtral 8x7B and Llama 3 70B. Integrated the vLLM inference engine to reduce latency and scale AI-generated responses efficiently across the system. Built FastAPI endpoints to serve the AI-powered features, enhancing the overall learning experience for students.
Large Language Models • FastAPI • vLLM • Python • OpenAI
Recipe for Rating: Predict Food Ratings using ML
Developed machine learning models to predict food ratings from recipe information and user reviews. Preprocessed and engineered features from a dataset containing multiple attributes. Evaluated multiple algorithms, including logistic regression, boosting techniques such as AdaBoost, and a multilayer perceptron. Achieved an accuracy of 77.166% with a logistic regression model.
Python • Machine Learning Algorithms • Pandas • Numpy • Scikit-learn
Connect


