About
I am an AI researcher and robotics engineer with a focus on computer vision, natural language processing, and autonomous systems. My research interests lie at the intersection of deep learning, robotics, and practical applications, with an emphasis on building efficient, deployable AI solutions.
Currently, I am exploring multimodal learning, document understanding systems, robot perception, and edge deployment of ML models. I am actively seeking research opportunities and collaborations in these areas.
Research Interests
News
- Jan 2025 Launched InvoiceVision: invoice extraction using Qwen 2.5 VL
- Dec 2024 Released DocShield for automatic PII redaction
- Jul 2025 Graduated from RUET with B.Sc. in Mechatronics Engineering
- Feb 2022 Completed Data Science internship at The Sparks Foundation
Selected Projects
Streamlit app for extracting structured data from invoice images using Qwen 2.5 VL via OpenRouter API.
Automated PII detection and redaction for PDF documents using OCR and pattern recognition.
Attention-driven text-to-image synthesis using AttnGAN architecture with word-level attention.
Real-time vehicle detection using YOLOv8, optimized for Raspberry Pi deployment.
SVM-based classification with hyperparameter optimization using GridSearchCV.
Regression model using feature engineering for accurate house price predictions.
Experience
- Developed regression models for house price prediction using feature engineering
- Built SVM classification pipeline for breast cancer detection with hyperparameter tuning
- Coordinated technical workshops and robotics competitions for 500+ students
- Facilitated knowledge sharing sessions on embedded systems and robotics
Education
CGPA: 3.17 / 4.00 (Last 2 Semesters: 3.65/4.00)
Technical Skills
Languages
Python, C++, SQL, JavaScript
ML/DL Frameworks
PyTorch, TensorFlow, Scikit-learn, Hugging Face, LangChain
Robotics
ROS2, SLAM, Robot Perception, Gazebo, Nav Stack
Tools & Platforms
Docker, FastAPI, Git, Linux, Streamlit, OpenCV
Computer Vision
Object Detection, Segmentation, OCR, YOLO, CNNs, Transformers
NLP & LLMs
RAG, Prompt Engineering, Generation