Hi, Pranav here!

About

I am a data professional with experience in exploring data and developing data-driven end-to-end solutions. I have experience in Natural Language Processing, Computer Vision, Audio Processing, Knowledge Graphs, and creating Data Pipelines. My passion lies in leveraging these skills to solve complex problems and drive innovation in a variety of domains.


Skills

Programming: Python, C++, Scala (elementary), SQL
Machine & Deep Learning: NLP, Computer Vision, Geometric Deep Learning
Data Manipulation & Analysis: Data Visualization, Data Cleaning, Feature Engineering, Feature Selection, Feature Extraction
MLOps & Version Control: Docker, Kubernetes, APIs, CI/CD, DVC, MLflow, Git
Databases: SQL, NoSQL, Graph databases
Data Engineering tools: ETL, PySpark, Apache Kafka, Airflow, Shell scripting
Cloud Platforms: AWS, Azure, GCP
Languages: English (Fluent), German (B1), Marathi, Hindi

Resume

Professional Experience

Data Scientist

Feb 2024 - Present

paiqo GmbH, Paderborn, Germany

Data Scientist

Mar 2022 - Jan 2024

BASF Digital Solutions GmbH, Ludwigshafen, Germany

  • Developed an app for summary extraction from unstructured documents with 85% accuracy
  • Implemented a document-based search application using LLMs (OpenAI API) and LangChain to enhance search capabilities
  • Developed a deep learning model for handwriting recognition on contractor timesheets to automate and accelerate data entry
  • Implemented machine learning models and RegEx to extract company names from PDFs
  • Refined and processed data for over 6000 products, preparing them for integration into knowledge graphs
  • Collaborated on the development of a table extraction model using LayoutLMv2
  • Developed structured information extraction pipelines for multiple data sources, e.g. PDFs and web pages, using OCR and web scraping
  • Secured second place in an internal AI-based hackathon
  • Led customer demos

Data Science Intern

Oct 2021 - Feb 2022

Chemovator GmbH, Mannheim, Germany

  • Developed cognitive search for semi-structured and unstructured data utilizing transformers and knowledge graphs (Medium Article)
  • Collaborated on the development of a chatbot for the chemical and pharma domains
  • Implemented an entity-based cognitive search using knowledge graphs
  • Carried out customer demos and rollouts

Freelancer

2017 - 2018

Pune, India

  • Developed and implemented advanced computer vision algorithms for client-specific needs
  • Developed and automated the ETL process

Project Engineer

2014 - 2016

Godrej and Boyce, Mumbai, India

  • Led the automation of industrial machines
  • Managed project scheduling and predictive maintenance tasks for timely project completion and reduced machine downtime

Education

Master of Science

Nov 2023

University of Paderborn, Paderborn, Germany

Thesis: Unsupervised Graph Neural Networks-based Shape Retrieval using Invariant Contour Features

Bachelor of Engineering

2014

University of Pune, Pune, India

Projects

  • Links to GitHub

APS Failure Prediction

A predictive system for heavy-duty vehicles that predicts failures caused by the Air Pressure System (APS), improving safety and reducing repair costs.

Detection and Extraction of Number Plates

An automated vehicle number plate recognition system using Inception-ResNet-v2 for plate detection and PaddleOCR for character extraction, designed for applications in traffic monitoring, parking management, toll plazas, and security surveillance.

Book Recommender System

A collaborative filtering-based book recommendation system implemented in Streamlit, designed to suggest similar books tailored to individual user interests.

Content Based Image Retrieval

A reverse image search system that covers data collection, model training, and prediction of related images, built with FastAPI, MongoDB, AWS S3, and various Python libraries, to retrieve relevant images for an input query.

Prediction of Consumer Grievances

An NLP-driven solution that determines if a consumer will file a dispute after a company's response, leveraging data from the Consumer Financial Protection Bureau (CFPB) and employing CatBoost for classification, enabling companies to prioritize responses and efficiently manage potential disputes.

Authentication App

A two-factor authentication system that integrates password-based security with FaceNet-driven face recognition and MTCNN-based face detection, offering enhanced digital security and mitigating vulnerabilities associated with traditional password systems.

Real-Time Safety Detection

A computer vision-based Industry Safety Detection (ISD) system utilizing YOLOv7 to monitor and ensure employees wear essential safety gear, including helmets, gloves, jackets, goggles, and footwear, before entering the workplace.

Language Identification App

An application designed to identify spoken content in four Indian languages (Hindi, Kannada, Tamil, and Telugu) from YouTube audio extracts, utilizing a custom CNN model trained on Mel Spectrogram images.

News Summarization App

An NLP solution that uses the t5-small transformer model from Hugging Face, fine-tuned on a dataset of news articles, to generate summaries that capture the main points of an article, offering a quick overview for readers with limited time.

Defect Detection in Steel Manufacturing

A defect detection system for steel sheets using Xception CNN and Vision Transformers for classification, paired with ResUNet models for precise localization of four distinct surface defects, outputting images with segmentation masks.

Quora Insincere Question Classification

A sentiment analysis system designed to identify and flag insincere questions on Quora using various models, including Random Forest, LSTM with Google embeddings, and BERT, with comprehensive exploratory data analysis and model evaluations presented in dedicated notebooks.

Stack Overflow Question Classification

A classification system that distinguishes Python from non-Python questions on Stack Overflow, using Data Version Control (DVC) to manage the workflow.

Estimation of Shipment Price

An application that leverages machine learning to predict shipment costs based on factors like package weight, dimensions, distance, transportation mode, and special handling requirements, enabling logistics companies to set dynamic pricing and optimize their services.

Face Swap

A computer vision application that swaps the faces of individuals in images, leveraging deep learning models for accurate face detection and replacement, enhancing entertainment and creative content generation.