Journey & Focus

I began with hands-on analytics projects and production dashboards, expanded into pipelines and platform work across cloud and open-source stacks, and now pair that technical breadth with governance discipline and a responsible-AI lens. The through-line is simple: design systems that teams trust — data that is reliable, models that are evaluated, and decisions that meet compliance and real-world needs. Today I help organizations connect these dots: strategy, controls, and measurable outcomes.

GitHub Projects

A sample of high-signal repositories; a broader set lives on my GitHub.

Title | Skills / Tools | Repo / Live | Description
Price Optimization Model | Python, Pandas, SQL, NumPy, scikit-learn, Price Optimization, Plotly Dash | Repo | End-to-end pricing pipeline: demand modeling, profit-maximizing recommendations, backtests.
SHA Disbursement Dashboard | Python, Camelot, PyPDF, Pandas, Plotly Dash | Repo | Automates extraction from SHA PDFs and visualizes disbursements across facilities and time.
Neonatal Outcomes Dashboard | Python, Dash, Plotly Express, SQL, Pandas | Repo | Sample analytics dashboard aligned to neonatal monitoring metrics.
Disaster Response Pipeline | Python, FastAPI, Dagster, dbt, NLP | Repo | Multilabel text classifier with needs detection; ETL orchestration and ingestion pipeline.
Data Maturity & Governance Tool | Python, Data Governance, Scoring, Data Playbooks, CDMP | Repo | Assessment utilities to score maturity, surface gaps, and drive roadmaps.
Bike Sharing Analysis | JupyterLab, EDA, Time Series Analysis, Machine Learning, ARIMA | Repo | Explores drivers of demand in Washington, D.C., with temporal patterns.
Bank Marketing Campaign Analysis | EDA, Feature Engineering, Classification | Repo | Campaign outcomes analysis; baselines for uplift/propensity modeling.
Sales Forecast (Time Series) | Jupyter, ARIMA, Time Series Forecasting, Prophet | Repo | Classic demand forecasting with ARIMA/Prophet; seasonality and trend components.
Redact PIIs | Python, NER, spaCy, Regex | Repo | Named-entity-based PII redaction workflow; quick compliance helper.
Where the Devs Are | Jupyter, GeoPandas, QGIS/Mapping, Marimo | Repo | Geospatial exploration of developer distributions from Stack Overflow survey data.

Education Timeline

A progression from IT foundations to advanced analytics, machine learning, and governance. Each step — from technical degrees to professional certifications — builds toward expertise in data strategy, responsible AI, and enterprise-scale management.

  1. International University of Applied Sciences, Germany (Ongoing)

    MSc in Data Management

    • Advanced study of governance operating models, data architecture, quality frameworks, and metadata/lineage management.
    • Produced strategy documents and governance playbooks mapped to DAMA-DMBOK2 functions.
    • Applied GDPR and Kenya DPA principles to data lifecycle design and AI use-case risk reviews.
    • Explored research methods, survey design, and field data collection techniques applicable to data management practice.
    • Engaged with academic and applied perspectives on ethics, privacy, and responsible AI within organizational contexts.
    Data Governance, Data Strategy, Metadata & Lineage, Data Quality, Architecture, ETL/ELT, Privacy (GDPR/DPA), Research Methods, Data Collection, Ethics, DAMA
  2. DAMA International (2024–2025)

    CDMP Certification

    • Certified in data management practices aligned with DAMA-DMBOK2R.
    • Developed frameworks for data governance, stewardship, and compliance.
    • Applied methods for data quality, metadata, and master/reference data management.
    • Covered topics on data architecture, integration, and warehousing design.
    • Training included risk, security, and ethical use of data within organizational contexts.
    Governance, Stewardship, Data Quality, Metadata, Master & Reference, Architecture, Integration, Warehousing, Modeling, Security, Risk, Ethics
  3. Udacity (2021)

    Data Science Nanodegree

    • Completed hands-on projects across supervised and unsupervised learning, model evaluation, and deployment.
    • Applied data wrangling, feature engineering, and experimentation techniques to real-world datasets.
    • Built and deployed machine learning pipelines with attention to scalability and reproducibility.
    • Capstone: Disaster Response Pipeline — multilabel NLP classification with ETL and model orchestration.
    Python, Pandas/NumPy, scikit-learn, SQL, Feature Engineering, Experimentation, Model Deployment, ETL, NLP, Machine Learning
  4. Udacity (2021)

    Data Analyst Nanodegree

    • Developed portfolio projects applying SQL, statistical inference, exploratory data analysis (EDA), and visualization.
    • Performed large-scale data wrangling and cleaning tasks to prepare datasets for analysis.
    • Applied statistical methods to evaluate hypotheses and uncover trends from real-world data.
    • Projects included OpenStreetMap data wrangling, weather trends analysis, and investigative reporting of custom datasets.
    SQL (Postgres), Wrangling, EDA, Statistical Inference, Visualization, Notebooks, Pandas, Data Cleaning, Storytelling
  5. Multimedia University of Kenya (MMU) (2013–2019)

    BSc in Information Technology

    • Completed coursework spanning database systems, networking, software engineering, and IT governance.
    • Applied SDLC best practices in individual and group projects, with emphasis on secure and maintainable design.
    • Final project: developed a real-time bidding application using Django/SQLite with authentication, authorization, and basic admin features.
    Database Design, Networking, Software Development, Django, SQLite/Postgres, AuthN/Z, SDLC, IT Governance, Systems Analysis