Gaurav Sharma psykick-21

Hi 👋, I'm Gaurav Sharma

Data Scientist passionate about uncovering hidden patterns in data. Adept at building and deploying machine learning and deep learning models, and data pipelines for real-world applications. Proficient in Python for data manipulation and analysis. Eager to leverage data science to solve challenging problems.

🔆 Highlights

Multi-Utility LLM Application

Developed a multi-utility application powered primarily by LLMs and the LangChain framework:

Question Answering: Built an interface where users can ask general questions and receive answers. Users can choose between several LLMs (gpt-3.5-turbo, llama3-8b-instruct, gemma-7b-it and Mistral-7B-Instruct-v0.2), served through the OpenAI API, Ollama, the Groq API and HuggingFace respectively, with LangChain's tools used to build the pipeline.
Website Search: Created an interface to search sources like Wikipedia, LangSmith and Arxiv by posing questions. Specialized LangChain agents and tools handle information lookup and context generation for each source, and an LLM generates the final response.
RAG App: Created a RAG chat app by combining document parsers, a text splitter, a vector store and a prompt into a chain, so that users can upload documents and chat with them.
📂 Head over to the repo to read about this project in detail
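
To give a rough idea of how the RAG pieces fit together, here is a minimal standard-library sketch. It is not the code from the repo and does not use LangChain: the names `split_text`, `ToyVectorStore`, `build_prompt` and `rag_prompt` are hypothetical, and the bag-of-words "embedding" is only a stand-in for a real embedding model and vector store such as ChromaDB.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character chunks (stand-in for a text splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy 'embedding': word counts. A real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory store that ranks chunks by similarity to the query."""

    def __init__(self):
        self.items = []  # list of (chunk, vector) pairs

    def add(self, chunks):
        for chunk in chunks:
            self.items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Fill a simple prompt template with the retrieved context."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def rag_prompt(document, question, k=2):
    """Chain the steps: split -> index -> retrieve -> build prompt."""
    store = ToyVectorStore()
    store.add(split_text(document))
    return build_prompt(question, store.search(question, k=k))
```

In an actual app the string returned by `rag_prompt` would be sent to the chosen LLM; that call is left out here because it depends on the provider.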

Text Summarization API

Built a text summarization API using a HuggingFace transformer (Google Pegasus) trained on the SAMSum dataset from HuggingFace, built training and inference pipelines served with FastAPI, and deployed it to AWS with a CI/CD pipeline.
📹 Watch a demonstration video: here
📂 Visit the repo: here

Ninjacart Image Classification

Trained an image classification model (CNN) from scratch using TensorFlow, and also fine-tuned pre-trained models for the required use case. Used Optuna to tune the models' hyperparameters and selected the best-performing one to run inference on the test dataset.
📂 Visit the repo: here

Customer Churn Prediction

Developed a machine learning model to predict customer churn. Utilized various classification algorithms including Logistic Regression, KNN, SVM, Decision Tree, Random Forest, XGBoost, LightGBM, AdaBoost, CatBoost and Stacking Ensemble, achieving 91.6% accuracy and 0.90 precision in identifying at-risk customers.
📂 Visit the repo: here

Porter Regression

Built a delivery time prediction model for Porter using regression techniques. Data preprocessing included handling missing values and outliers, along with feature engineering and standardization. Experimented with models including Linear Regression, Decision Tree, XGBoost, AdaBoost, CatBoost, LightGBM, Random Forest and Neural Networks. The LightGBM Regressor performed best, with the lowest mean squared error (0.653).
📂 Visit the repo: here

🛠️ Languages, tools and skillset:

Languages: Python, SQL
Concepts: Data Analysis, Probability and Statistics, Machine Learning, Deep Learning, Unsupervised learning, Feature Engineering, MLOps
Tools and software: Tableau, Postman, Docker, Git
Libraries, utilities and frameworks: NumPy, Pandas, scikit-learn, Matplotlib, Seaborn, TensorFlow, Keras, PySpark, Snowflake, MongoDB, ChromaDB

👨‍💻 All of my projects are available in the Repositories
📫 Reach me at [email protected]
📙 Visit my Medium blog here
