Data Scientist & Analytics Leader

Omoyeni Ogundipe

Founder, YÉNI & Omoyeni.io · ex-Amazon

Houston, TX

Results-driven Data Scientist and Analytics Leader with 6+ years of experience delivering end-to-end data solutions across machine learning, business intelligence, and data engineering. Expert at transforming complex data into measurable outcomes across finance, marketing, healthcare, e-commerce, and enterprise technology — from predictive modeling and experimentation to scalable ETL pipelines and executive-facing BI.

6+ Years Experience
93% ML Model Accuracy
4.0 M.S. GPA
35% ETL Time Reduction
01

Technical Skills

Analytics
  • Python · SQL · R · PySpark
  • NumPy · Pandas · Scikit-learn
  • TensorFlow · PyTorch
  • Statistical Inference & Modeling
  • A/B Testing & Experimentation
  • Time-Series Forecasting
Data Engineering
  • Snowflake · Redshift · AWS S3
  • Apache Airflow · dbt
  • Hadoop · Spark
  • ETL / ELT Pipeline Design
  • Data Warehousing & Modeling
  • KPI Frameworks
ML, BI & Cloud
  • Predictive Modeling · NLP · Deep Learning
  • Feature Engineering · Model Deployment
  • Tableau · Power BI · QuickSight
  • Matplotlib · Plotly
  • AWS (S3, Redshift, QuickSight)
  • Recommendation Systems
02

Experience

Current
YÉNI + Omoyeni.io
Houston, TX
Founder · Software Engineer & Data Professional
  • Founded and lead development of technology-driven consumer and digital brands spanning luxury products, software platforms, and analytics solutions.
  • Architect and build analytics infrastructure and dashboards tracking customer behavior, acquisition channels, and product performance.
  • Design and develop full-stack web platforms and backend integrations supporting e-commerce launches.
  • Lead end-to-end product development including requirements, UI/UX, software implementation, and digital marketing strategy.
Full-Stack · Analytics Infrastructure · Product Strategy · E-Commerce
Previous
Amazon
Seattle, WA
Business Intelligence Engineer
  • Engineered analytics and experimentation frameworks for Amazon Photos, driving a 12% increase in DAU and 14% growth in MAU.
  • Designed and executed A/B tests and cohort analyses, resulting in a 14% reduction in user churn.
  • Built distributed ML models in PySpark to forecast user re-engagement with 93% accuracy.
  • Automated ETL pipelines using SQL, dbt, and Airflow, reducing data processing time by 35% and improving data quality.
  • Developed self-serve Tableau and QuickSight dashboards adopted by cross-functional leadership.
PySpark · dbt · Airflow · A/B Testing · Tableau · QuickSight
Previous
Dream Chase Technologies
New York, NY
Business Intelligence Engineer
  • Led analytics and deep learning initiatives that improved platform DAU by 10% through targeted engagement strategies.
  • Designed and deployed TensorFlow-based recommendation models, increasing content satisfaction scores by 18%.
  • Built interactive Power BI dashboards and automated reporting pipelines.
TensorFlow · Deep Learning · Power BI · Recommendations
Previous
Carrier
Georgia, USA
Data Scientist
  • Engineered neural-network fraud detection models improving anomaly detection accuracy by 20%.
  • Developed pricing optimization models using Python and R, contributing to a 15% revenue increase.
  • Applied K-Means and hierarchical clustering for customer segmentation, enabling personalized pricing strategies.
  • Automated pricing analytics pipelines with dbt and SQL, significantly reducing report turnaround times.
Fraud Detection · Clustering · Pricing Optimization · dbt
Previous
Camp House
Remote
Data Scientist
  • Developed collaborative-filtering recommendation systems in Python, increasing user engagement by 30%.
  • Designed and analyzed A/B tests on user behavior, improving conversion rates by 15%.
  • Delivered Power BI dashboards for sentiment analytics, boosting campaign effectiveness by 20% and driving a 25% expansion in market share.
Recommendations · A/B Testing · ETL · Sentiment Analysis
03

Projects

Industry Work
I–01 · Big Tech · Consumer Product
Re-Engagement Prediction Engine — 150-Variable ML Pipeline
93% Accuracy · 150+ Features · Production PySpark

Built a production-grade PySpark machine learning pipeline to predict user re-engagement for a consumer product serving hundreds of millions of users. The model ingested over 150 predictor variables spanning behavioral signals (session frequency, feature interaction depth, content engagement patterns, notification response rates, cross-device activity, and dormancy duration), combined with user lifecycle attributes, historical retention markers, and product usage cadence indicators.

Engineering the feature set at this scale required building distributed Spark transformations across a massive data lake, including custom aggregation windows across multiple time horizons, lag features capturing behavioral decay, and robust imputation strategies for sparse signals across infrequent users. The complexity was significant: any naive join or aggregation at this data volume would collapse; every transformation had to be written with partitioning and execution plan efficiency in mind. Multiple classification algorithms were benchmarked — Gradient Boosted Trees, Random Forest, Logistic Regression, and XGBoost — with the final tuned ensemble achieving 93% accuracy with strong precision-recall balance to avoid over-notifying users with low re-engagement probability. Model outputs fed directly into the churn reduction strategy, informing targeting logic for re-engagement campaigns and contributing to a 14% reduction in user churn.
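
As a rough illustration of the aggregation windows and lag features described above, here is a minimal single-machine sketch in plain Python. The production pipeline ran as distributed Spark transformations; the function name, the 7-day window, and the 7-day lag are hypothetical choices made only for this example.

```python
from collections import deque


def rolling_and_lag_features(daily_sessions, window=7, lag=7):
    """Build one feature dict per day from a list of daily session counts.

    Hypothetical helper: `window` and `lag` are illustrative defaults,
    not values taken from the original pipeline.
    """
    features = []
    recent = deque(maxlen=window)  # last `window` days, including today
    for day, count in enumerate(daily_sessions):
        recent.append(count)
        rolling_mean = sum(recent) / len(recent)
        # Lag feature: activity `lag` days ago, or None if history is too short.
        lagged = daily_sessions[day - lag] if day >= lag else None
        # Decay signal: change in activity relative to the lagged value.
        decay = count - lagged if lagged is not None else None
        features.append(
            {"day": day, "rolling_mean": rolling_mean, "lagged": lagged, "decay": decay}
        )
    return features
```

A user who was active every day for a week and then went quiet shows a negative `decay` value on the first dormant day, which is the kind of behavioral-decay signal the paragraph refers to.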

PySpark · Feature Engineering · GBT · XGBoost · Random Forest · Classification · Big Data · AWS
I–02 · Big Tech · Consumer Product
Org-Wide Product Metrics Source of Truth Dashboard
Daily · Weekly · Monthly · Analytics Engineering at Scale

Built the organization's single source of truth for product metrics: a mega-dashboard serving daily, weekly, and monthly performance tracking for the entire product org and leadership. The challenge wasn't just visualization; it was the analytics engineering underneath it. The data was spread over dozens of disparate tables in multiple pipelines, requiring complex multi-table joins, careful deduplication, incremental aggregation logic, and layered data transformations to produce a unified, consistent dataset.

Built the entire data layer from scratch — designing intermediate staging tables, writing optimized SQL transformations that could run efficiently at org scale without query timeouts or data freshness lag, and engineering dbt models to handle the dependency chain reliably. The final dashboard consolidated active users, engagement depth, retention curves, feature adoption, and content metrics across daily, weekly, and monthly time grains — each with consistent metric definitions so that numbers never conflicted between views. Leadership used this as the primary lens for product reviews, roadmap decisions, and organizational health assessment. It became the trusted, go-to view that replaced ad hoc reporting across the org.
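
To make the point about deduplication and consistent metric definitions concrete, here is a small Python sketch that derives daily, weekly, and monthly active users from one deduplicated event set, so the three grains cannot disagree on what counts as an active user. The actual data layer was built in SQL and dbt; the function name and the `(user_id, date)` row shape are assumptions made for illustration.

```python
from datetime import date


def active_users_by_grain(events):
    """Count distinct active users per day, ISO week, and calendar month.

    `events` is an iterable of (user_id, date) rows; the row shape is a
    hypothetical simplification of a real event table.
    """
    daily, weekly, monthly = {}, {}, {}
    for user_id, day in set(events):  # set() drops exact duplicate rows
        iso = day.isocalendar()  # (ISO year, ISO week, weekday)
        daily.setdefault(day, set()).add(user_id)
        weekly.setdefault((iso[0], iso[1]), set()).add(user_id)
        monthly.setdefault((day.year, day.month), set()).add(user_id)

    def count(groups):
        return {key: len(users) for key, users in groups.items()}

    return count(daily), count(weekly), count(monthly)
```

Because every grain is computed from the same deduplicated rows with the same distinct-user rule, a weekly number can never exceed what the underlying daily rows support.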

Analytics Engineering · dbt · SQL · Tableau · QuickSight · Data Modeling · ETL · KPI Design
I–03 · Big Tech · Consumer Product
Product Performance & Experimentation Dashboard
Self-Serve Analytics · Experiment Evaluation: Days → Hours

Designed and built a Tableau dashboard that became the single source of truth for product health and A/B experimentation across a large consumer product. Before this, evaluating feature performance required ad hoc SQL queries routed through analysts, creating multi-day delays and inconsistent metric definitions that led to conflicting interpretations of the same experiment across teams.

The dashboard centralized KPI scorecards (DAU, MAU, engagement, retention), variant-level experiment comparisons with statistical significance indicators, funnel analysis across key user journeys, and cohort views for post-launch behavioral tracking. On the backend, built optimized SQL pipelines aggregating data from multiple sources into clean, analysis-ready tables, and implemented standardized experiment metric logic so that teams always compared like with like. Product managers could evaluate launches in near real time and decide to scale, iterate, or roll back within hours instead of days. The experimentation views became a fixture in weekly product reviews, and self-serve adoption across the org increased significantly.
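
One common way to produce a statistical significance indicator for a variant comparison is a two-proportion z-test on conversion counts. The sketch below is an assumption about how such an indicator could be computed, not a description of the dashboard's actual logic; the function name and inputs are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between A and B.

    Returns (z, p_value). Assumes both groups are non-empty and the pooled
    rate is strictly between 0 and 1, otherwise the standard error is zero.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF via the error function, then a two-sided p-value.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)
```

For example, `two_proportion_z_test(100, 1000, 150, 1000)` compares a 10% control rate with a 15% variant rate; a dashboard could flag the variant as significant when the returned p-value falls below a chosen threshold such as 0.05.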

Tableau · SQL · A/B Testing · Funnel Analysis · Cohort Analysis · KPI Design · Self-Serve BI
I–04 · Big Tech · Consumer Product
First-Touch & Last-Touch Attribution Model
10M+ Users · Multi-Channel Attribution · SQL

Built a first-touch and last-touch attribution framework in SQL to map how users across a digital product with over 10 million users discovered and adopted key features. First-touch attribution identified the original acquisition channel or entry point that introduced a user to the product — critical for understanding which awareness channels drove the highest-quality, highest-retention users. Last-touch captured the final interaction before a meaningful conversion event, revealing which touchpoints were actually closing adoption decisions.

The model was built on top of a full user event stream, requiring careful session reconstruction, deduplication logic, and handling of multi-device journeys where a single user might interact across web, mobile, and embedded surfaces. The output enabled marketing and product teams to accurately assess channel efficiency, reallocate acquisition spend toward highest-performing sources, and identify friction points in the adoption funnel where users were dropping off before converting — directly informing campaign strategy and onboarding redesigns.
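
The core first-touch and last-touch logic can be sketched in a few lines. The original framework was written in SQL over a full event stream; this Python version is only an illustration, and the event tuple shape, the `"convert"` event name, and the function name are all hypothetical.

```python
def first_and_last_touch(events, conversion_event="convert"):
    """Map each converting user to their first and last pre-conversion touch.

    `events` is an iterable of (user_id, timestamp, name) tuples, where
    `name` is either a channel/touchpoint or the conversion event.
    """
    by_user = {}
    # set() drops exact duplicate events; sorting orders each user's journey.
    for user_id, ts, name in sorted(set(events), key=lambda e: (e[0], e[1])):
        by_user.setdefault(user_id, []).append(name)

    attribution = {}
    for user_id, journey in by_user.items():
        if conversion_event not in journey:
            continue  # users who never converted get no attribution
        touches = journey[: journey.index(conversion_event)]
        if touches:
            attribution[user_id] = {
                "first_touch": touches[0],
                "last_touch": touches[-1],
            }
    return attribution
```

Only touches before the first conversion are considered, so later activity does not overwrite the touchpoint that closed the adoption decision. Multi-device identity resolution and session reconstruction, which the paragraph above mentions, are out of scope for this sketch.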

SQL · Attribution Modeling · Funnel Analysis · Event Streams · User Acquisition · Multi-Channel
I–05 · Global Manufacturing · Enterprise
Equitable Pricing Optimization Model — 94% Accuracy
94% Accuracy · 15% Revenue Increase · Tree & Ensemble Models

Built a robust, production-grade pricing optimization model for a global enterprise — one that was not only technically rigorous but also designed around legally acceptable, equitable pricing principles. The model was built on behavioral data that met strict fairness and compliance requirements, augmented with external market data, competitor pricing signals, inflation indices, and macroeconomic indicators to create a multi-dimensional view of pricing conditions. The goal was dual: maximize company profitability while simultaneously preserving customer satisfaction and long-term retention.

Experimented extensively with tree-based and ensemble approaches (Decision Trees, Random Forest, Gradient Boosted Trees, XGBoost, and LightGBM), evaluating each on both predictive accuracy and interpretability to ensure the pricing logic could be explained and audited. The final model achieved 94% accuracy in predicting the price at which individual customers were willing to purchase, enabling personalized pricing at scale. The model was then operationalized by scaling it and integrating it directly into a Power BI environment, with a dashboard that allowed commercial and finance teams to run pricing scenarios, simulate revenue outcomes, and deploy pricing decisions in real time. An A/B test of the new pricing strategy against a control group showed a 15% increase in revenue, a significant increase in total orders and order rate, and a measurable improvement in customer satisfaction scores related to perceived pricing fairness.

XGBoost · LightGBM · Random Forest · GBT · Pricing Optimization · Power BI · A/B Testing · Market Data · dbt
I–06 · AI Interior Design · Founder Project
AI Interior Design Generation Engine
Image-In · Design-Out · End-to-End Generative AI Pipeline

Built the core AI engine powering Omoyeni.io — an end-to-end generative AI system that takes a photo of a user's physical space as input and produces a fully redesigned or newly designed version of that space based on their style preferences and interior design specifications. The pipeline begins with image understanding: a computer vision model analyzes the uploaded photo to detect spatial structure, existing furniture placement, lighting conditions, room dimensions, and architectural elements — forming a semantic map of the space.

This spatial understanding is combined with the user's stated design preferences — style (minimalist, maximalist, Scandinavian, industrial, etc.), color palette, material preferences, and budget tier — to condition a fine-tuned diffusion model that generates photorealistic redesign outputs. The generative backbone leverages Stable Diffusion with ControlNet conditioning to preserve the room's structural geometry while replacing and redesigning surfaces, furniture, lighting, and décor. Style transfer and inpainting techniques allow targeted redesign of specific zones without disrupting the overall spatial layout. The output image is matched against a product catalog using visual similarity search (CLIP embeddings + vector database) to surface shoppable furniture and décor items that match the generated design, which users can add directly to their cart. A Tasker service integration completes the loop — users can book a professional to physically implement the design, making the platform a true end-to-end interior design experience from inspiration to installation.
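
The visual similarity search step can be illustrated with a brute-force cosine-similarity lookup. This is a sketch under stated assumptions: the embeddings are taken as already computed (for example by a CLIP image encoder), the catalog is a small in-memory dict standing in for the vector database, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_products(query_embedding, catalog, k=3):
    """Return the ids of the k catalog items most similar to the query.

    `catalog` maps product id -> precomputed embedding. A real vector
    database would replace this linear scan with an approximate index.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), product_id)
        for product_id, embedding in catalog.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [product_id for _, product_id in scored[:k]]
```

With a toy catalog such as `{"sofa": [1.0, 0.0], "lamp": [0.0, 1.0], "rug": [0.7, 0.7]}`, a query embedding close to the sofa vector ranks the sofa first and the rug second.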

Stable Diffusion · ControlNet · Computer Vision · CLIP · Vector Search · Inpainting · Style Transfer · Generative AI · Python
I–07 · Tech Platform · Consumer
Customer Pulse — NLP Sentiment & Review Analytics
Real-Time Power BI · NLP Sentiment · Daily Refresh

Developed a Power BI application for Customer Pulse Analytics that used NLP to automatically classify customer reviews and feedback as positive or negative in real time. The dashboard included sentiment analytics, chatbot interaction analytics, and review trend tracking — all refreshing daily via an automated pipeline so the team could monitor customer perception continuously rather than reactively.

The system gave the team a daily view of customer review analyses, so they could make data-driven decisions promptly and respond to emerging sentiment shifts before they escalated. Proactive monitoring of customer perception ensured product and support decisions were grounded in current feedback rather than lagging reports, directly contributing to a 10% improvement in platform daily active users.

NLP · Sentiment Analysis · Power BI · TensorFlow · Real-Time Pipeline · Text Classification
I–08 · Digital Platform · Consumer
Collaborative Filtering Recommendation System
30% Engagement Increase · Cold-Start Solved · A/B Tested

Built a collaborative filtering recommendation system from scratch in Python for a platform with a growing but not yet large customer base, where data sparsity made standard collaborative filtering challenging. Solved the cold-start problem by constructing a user-item interaction matrix representing all historical user-booking relationships, then computing Pearson correlation coefficients to identify similarity between users based on behavioral patterns. Neighborhood selection identified subsets of the most similar users, which powered personalized recommendations for both existing and new users.
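
The approach described above (user-item matrix, Pearson similarity, nearest neighbors) can be sketched as follows. This is a minimal illustration, not the deployed system: the nested-dict matrix, the function names, and the similarity-weighted scoring rule are assumptions, and a user with no history at all would still need a fallback that is not shown here.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have interacted with."""
    common = sorted(set(ratings_a) & set(ratings_b))
    n = len(common)
    if n < 2:
        return 0.0  # not enough overlap to estimate a correlation
    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def recommend(target, interactions, k=2):
    """Rank unseen items for `target` using its k most similar users."""
    similarities = [
        (pearson(interactions[target], other), user_id)
        for user_id, other in interactions.items()
        if user_id != target
    ]
    # Neighborhood selection: top-k users, keeping positive correlations only.
    neighbors = [(s, u) for s, u in sorted(similarities, reverse=True)[:k] if s > 0]
    scores = {}
    for similarity, user_id in neighbors:
        for item, value in interactions[user_id].items():
            if item not in interactions[target]:
                scores[item] = scores.get(item, 0.0) + similarity * value
    return sorted(scores, key=scores.get, reverse=True)
```

Items are scored by summing each neighbor's interaction value weighted by that neighbor's similarity, so recommendations come only from users whose past behavior correlates positively with the target's.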

After deployment, ran a rigorous A/B test using simple random sampling and stratified cookies to ensure clean group separation: the control group saw the previous system, and the test group saw the new recommendation engine. Measured performance across engagement metrics including time on site, click-through rate, and purchase rate. The test showed a 30% increase in user engagement and a 15% improvement in conversion rates.

Collaborative Filtering · Python · Pearson Correlation · A/B Testing · User-Item Matrix · Cold Start
Open Source & Academic
OS–01 · Open Source
Stock Price Prediction Using ML Algorithms
Udacity Data Science Nanodegree Capstone

Applied multiple ML algorithms to predict stock price movements. Complete pipeline from data acquisition through feature engineering to model evaluation and comparison.

Python · Scikit-learn · Time Series · Regression
OS–02 · Open Source
Churn Prediction — PySpark at Scale
93% Accuracy · Distributed ML Pipeline

Predicted customer churn for a music streaming platform (Sparkify) using distributed PySpark infrastructure. Scalable feature engineering handles massive datasets for production-grade retention insights.

PySpark · Big Data · Feature Engineering · Classification
OS–03 · Open Source
NLP Disaster Response Classifier
Real-Time Flask Inference API

Multi-class text classifier using NLTK and Scikit-learn that categorizes emergency messages and routes them to the correct response team in real time.

NLP · NLTK · Flask · Multi-class
OS–04 · Open Source
Airbnb Price Prediction — Seattle
10% Reduction in Mean Absolute Error

Random Forest and XGBoost regression models for Seattle Airbnb pricing. Deep EDA uncovered seasonal trends and neighborhood dynamics; feature engineering drove a 10% MAE improvement.

XGBoost · Random Forest · EDA · Regression
OS–05 · Open Source
Insurance Fraud Detection
18% Improvement in Detection Precision

Deep learning classification model for insurance fraud. Achieved an 18% improvement in detection precision through architecture tuning and advanced feature construction.

Deep Learning · TensorFlow · Classification · Anomaly Detection
OS–06 · Open Source
Rising Interest Rates on Fixed Income Portfolios
Statistical Analysis & Research

In-depth statistical analysis examining how rising interest rates impact fixed income instruments and portfolios. Includes quantitative modeling, scenario analysis, and research presentation.

R · Statistical Analysis · Financial Modeling
04

Research

ResearchGate · Sep 2024
Integration of Machine Learning Algorithms for Real-Time Risk Assessment in Financial Trading Systems

Investigates the integration of ML algorithms — Random Forests, XGBoost, and Deep Neural Networks — for real-time risk assessment in high-frequency financial trading. Improves identification and mitigation of risks associated with financial market fluctuations by analyzing historical price trends, trading volume anomalies, and market volatility indices.

↗ Read on ResearchGate
Int. Journal of Business & Economics Research · Vol. 12, No. 2 · 2023
Application of Statistical Inference Using Entropy to Characterize the Transfer of Data Across Financial Systems

Proposes a novel approach to quantify information transfer between financial systems using Transfer Entropy and Kullback-Leibler divergence measures, enabling deeper understanding of inter-system data flow dynamics. Cited in subsequent research on statistical inference applications in both business and medical contexts.
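
For readers unfamiliar with the Kullback-Leibler divergence mentioned above, the following sketch computes the standard textbook definition for two discrete distributions, in bits. It illustrates the measure itself and is not the paper's estimation procedure; the function name is hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits for discrete distributions.

    `p` and `q` are equal-length sequences of probabilities. Terms where
    p is zero contribute nothing; if q is zero where p is not, the
    divergence is infinite.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue
        if q_i == 0:
            return math.inf
        total += p_i * math.log2(p_i / q_i)
    return total
```

The divergence is zero when the two distributions are identical and grows as `q` becomes a worse description of `p`. It is also asymmetric: `kl_divergence(p, q)` generally differs from `kl_divergence(q, p)`.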

↗ Read on ResearchGate
ResearchGate · Sep 2024
Leveraging AI for Financial Security in Emerging Markets

Examines how ML, NLP, and predictive analytics address financial security challenges in emerging markets — including vulnerability to fraud, regulatory compliance, and financial instability — through comprehensive case-study review, highlighting benefits, limitations, and future directions for AI-driven financial security.

↗ Read on ResearchGate
05

Writing

I write on Medium about data science, machine learning, analytics engineering, and the intersection of data with real-world business decisions — practical perspectives from 6+ years in the field.

Read on Medium
06

Education

M.S. Computer Science & Quantitative Methods
Austin Peay State University
Mathematical Finance
GPA: 4.0 / 4.0

Supervised & unsupervised learning, software engineering, deep learning, data engineering, A/B testing, experimental design, and recommendation systems. Built end-to-end ETL, NLP, and ML pipelines on real-world datasets.

B.S. Accounting
Ajayi Crowther University
Undergraduate
GPA: 4.58 / 5.0

Strong quantitative and analytical foundation in financial systems, accounting principles, and business economics — directly informing work at the intersection of data science and financial modeling.

07

Let's Connect

Open to data science collaborations, consulting engagements, research partnerships, and conversations about how analytics can drive meaningful impact.

Founder & Entrepreneur

Building Ventures that Matter

Alongside a career in data science, Omoyeni has founded and co-founded multiple companies — from multi-chain brick-and-mortar businesses to AI-powered software platforms. Each venture is rooted in the same belief: that good data and bold execution create lasting value.

5 Ventures Founded
2 Active Companies
3 Profitable Exits
Appetite to Build
01

Ventures

From healthy food chains and logistics businesses to AI-powered immigration tools and luxury consumer brands, Omoyeni has built across industries with a common thread — data-informed strategy and product thinking.

2025 – Present
Active
YÉNI
Luxury Consumer Brand

A luxury consumer brand founded and led by Omoyeni. YÉNI represents a bold intersection of identity, design, and quality — built for a discerning audience and powered by data-driven marketing and product strategy.

E-Commerce · Luxury · Consumer · Brand
shopyeni.com
2025 – Present
Active
Omoyeni.io
AI-Powered Interior Design Platform

An AI interior design platform where users upload photos of their space and receive AI-generated design and redesign concepts tailored to their style. What sets Omoyeni.io apart is its end-to-end experience: users can browse the design, add furniture and décor items directly to their cart from within the platform, and book a Tasker service to physically bring the design to life. A one-stop shop from inspiration to installation.

AI · Interior Design · E-Commerce · SaaS · Marketplace
omoyeni.io
2024 – 2025
Exited · Co-Founded
Visa Companion
AI-Powered Immigration Platform

Co-founded an AI-powered immigration platform designed to simplify and guide users through complex visa and immigration processes. Visa Companion leverages intelligent automation and natural language interfaces to make immigration more accessible and less overwhelming.

AI · LegalTech · Immigration · NLP · SaaS
visacompanion.ai
2023 – Present
Active
AI Pathfinder
AI Education & Career Navigation

Founded AI Pathfinder to help individuals and organizations navigate the rapidly evolving AI and data science landscape. The platform provides education, guidance, and resources to empower people to build meaningful careers and capabilities in artificial intelligence.

AI · EdTech · Career Guidance · Machine Learning
aipathfinder.io
2019 – 2021
Exited · Sold 2021
Stanwick Enterprise
Multi-Chain Consumer Business · Founded & Exited

Founded and scaled a multi-chain enterprise across three distinct consumer verticals, all operating profitably before a successful sale in 2021. Stanwick Enterprise demonstrated Omoyeni's ability to build, operate, and exit businesses at the intersection of consumer demand and operational excellence.

Stanwick Foods — A healthy food and café concept built around clean ingredients, modern branding, and a loyal customer base.
Stanwick Logistics — A B2B, B2C, and C2C delivery service providing last-mile logistics solutions for businesses and individuals.
Stanwick Interiors — An interior design services and retail business offering bespoke design solutions and curated home products.
Food & Beverage · Logistics · Interior Design · Retail · Multi-Chain