Oladapo

Data science Project

View the Project on GitHub Oladapoduk/AI-Lab.io

Projects

Kidney-Disease-Classification-Deep-Learning-Project

GitHub “Kidney Disease Classification with MLflow” is a machine learning project focused on developing and deploying a predictive model for diagnosing kidney diseases. MLflow, a popular open-source platform for managing machine learning workflows, is used to streamline model development. The project uses a dataset of kidney disease-related features to train and evaluate machine learning algorithms and produce a classification model. MLflow handles model tracking, experimentation, and deployment, which keeps the work reproducible and scalable.

Sensor-Fault-Detection Project

GitHub In this project, the system in focus is the Air Pressure System (APS), which generates pressurized air used in various truck functions, such as braking and gear changes. The dataset's positive class corresponds to failures of a specific APS component. The negative class corresponds to trucks with failures in components unrelated to the APS. The goal is to reduce the cost of unnecessary repairs, so false predictions must be minimized.
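A minimal sketch of how "minimize the false predictions" can be turned into a cost figure. The function names and the example costs (10 per false alarm, 500 per missed failure) are assumptions made for illustration, not values taken from the project.

```python
def total_cost(y_true, y_pred, fp_cost, fn_cost):
    """Return the summed cost of false positives and false negatives.

    Labels are 1 (APS component failure) and 0 (failure unrelated to the APS).
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp * fp_cost + fn * fn_cost


def best_threshold(y_true, scores, thresholds, fp_cost, fn_cost):
    """Pick the decision threshold with the lowest total cost."""
    def cost_at(threshold):
        y_pred = [int(score >= threshold) for score in scores]
        return total_cost(y_true, y_pred, fp_cost, fn_cost)

    return min(thresholds, key=cost_at)


if __name__ == "__main__":
    # Hypothetical labels and model scores, for illustration only.
    y_true = [1, 0, 1, 0]
    scores = [0.9, 0.4, 0.3, 0.1]
    print(best_threshold(y_true, scores, [0.2, 0.5], fp_cost=10, fn_cost=500))
```

When a missed failure is assumed to cost far more than an unnecessary workshop visit, the cheaper threshold is the lower one, even though it raises more false alarms.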

Sentiment analysis using Natural Language Processing MLOps/AIOps-Project

GitHub “NLP Classification with MLflow and DVC” is a project that combines natural language processing (NLP) and machine learning for text classification tasks. MLflow, a machine learning lifecycle management tool, is used to orchestrate the development and deployment of NLP models. Data Version Control (DVC) is integrated into the workflow to version the large datasets typically associated with NLP tasks. The project aims to streamline the end-to-end process of training, tracking, and deploying NLP classification models, keeping it reproducible and scalable while handling complex and evolving text data.

Sales Forecasting using Machine learning

GitHub This project follows the Cross-Industry Standard Process for Data Mining (CRISP-DM). A data science project is often pictured as linear: data preparation, modeling, evaluation, and deployment. Under the CRISP-DM methodology, however, the project takes a cyclical form: even after it ends in Deployment, it can restart at Business Understanding. How might it help?
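The cyclical shape of CRISP-DM can be sketched as a list of the six standard phases that wraps around. The helper name `next_phase` is hypothetical, and the sketch is simplified: the full methodology also allows moving back and forth between neighbouring phases.

```python
# The six standard CRISP-DM phases, in their usual order.
CRISP_DM_PHASES = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]


def next_phase(current):
    """Return the phase that follows `current`, wrapping after Deployment."""
    index = CRISP_DM_PHASES.index(current)
    return CRISP_DM_PHASES[(index + 1) % len(CRISP_DM_PHASES)]


if __name__ == "__main__":
    phase = "Evaluation"
    for _ in range(3):
        phase = next_phase(phase)
        print(phase)
```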

Stock Market Kafka Data Engineering-Project

GitHub Project Description: Executed a comprehensive end-to-end data engineering project involving real-time stock market data using Apache Kafka. Leveraged a stack of technologies including Python, Amazon Web Services (AWS), Apache Kafka, AWS Glue, Athena, and SQL.

Programming Languages: Python

Cloud Services: Amazon Web Services (AWS): S3 (Simple Storage Service), Athena, Glue Crawler, Glue Catalog, EC2 (Elastic Compute Cloud)

Streaming Platform: Apache Kafka

Data Engineering YouTube Analysis-Project

GitHub Project Description: The project focuses on the efficient management and analysis of YouTube video data, categorized by trends and metrics. Key goals include creating a data ingestion mechanism, transforming raw data through an ETL system, establishing a centralized data repository (Data Lake), ensuring scalability, and using AWS for large-scale data processing. The AWS services used include Amazon S3 for storage, AWS IAM for access management, QuickSight for BI insights, AWS Glue for data integration, AWS Lambda for serverless computing, and Amazon Athena for querying data in S3.

LangChain App-Large Language Modelling Project

GitHub “The MultiPDF Chat App is a Python-based application for chatting with multiple PDF documents. Users can hold natural language conversations and ask questions about the content of these PDFs. The application uses a language model to give precise, contextually appropriate answers. Note that the application only responds well to questions that are relevant to the loaded PDFs, since it is specifically tailored to queries about these documents.”
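As a rough illustration of why answers depend on the question being relevant to the loaded PDFs, the sketch below splits text into chunks and picks the chunk that shares the most words with the question. This is a simplified stand-in and not the app's actual retrieval method; the function names and the sample chunks are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def chunk_text(text, size=50):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def top_chunk(question, chunks):
    """Return the chunk sharing the most words with the question."""
    question_words = tokenize(question)
    return max(chunks, key=lambda chunk: len(question_words & tokenize(chunk)))


if __name__ == "__main__":
    # Hypothetical chunks standing in for text extracted from two PDFs.
    chunks = [
        "Kafka streams stock prices in real time.",
        "The kidney model classifies CT scans.",
    ]
    print(top_chunk("What does the kidney model classify?", chunks))
```

A question that shares no vocabulary with any chunk gives the selection step nothing to work with, which is the limitation the description points out.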

Data Scientist

Technical Skills: Python, R, SQL, AWS, Airflow, Kafka

Education

Work Experience

Data Scientist/Machine Learning Research Engineer (April 2021 - October 2023, United Kingdom)

Data Science/Artificial Intelligence Project Facilitator (Volunteer) (January 2023 - April 2023, United Kingdom)

Leeds Beckett University @ Balfour Beatty plc/Leeds Beckett University (KTP) (November 2020 - March 2021, United Kingdom)

Leeds Beckett University @ Leeds Beckett University (September 2017 - October 2020, United Kingdom)

Data Science Insight Analyst @ Sonorys Technology GmbH (January 2015 - August 2017, Austria)

Talks & Lectures

Publications