Pa Alieu Dibba

Bridging data and intelligence, from scalable cloud pipelines to computer vision systems that see and understand the world.

About Me

I'm a Data Scientist and ML Engineer with a deep passion for Computer Vision and intelligent data systems. My work spans the full data lifecycle, from architecting cloud data warehouses on GCP BigQuery to building deep learning models that tackle real-world vision challenges.

At Otamiser, I contributed to building the company's first scalable data warehouse, orchestrated pipelines with Dagster, and applied NLP and AI models to analyze Airbnb listing dynamics. My academic research focuses on attention mechanisms in GANs for medical image analysis.

I thrive at the intersection of innovation and precision, turning messy, complex data into clean, actionable intelligence.

Python · SQL · GCP · dbt · Dagster · PyTorch · Computer Vision · NLP · BigQuery · Spark
My Journey

Experience & Education

Mar 2025 – Present AutoRank by Otamiser · Ghent, Belgium

Data Scientist (Data Researcher)

  • Analyzed how Airbnb host badges influence listing rankings to identify visibility drivers
  • Applied NLP to evaluate how listing description structure impacts ranking performance and engagement
  • Analyzed listing image patterns to differentiate visual characteristics between low- and high-ranked properties
  • Investigated host qualities and their role in guest booking decisions
  • Optimized Airbnb listing content (titles, summaries, space descriptions) and generated structured reports on best formatting practices
  • Conducted Airbnb listing switch analysis and examined its correlation with ranking changes
  • Built automation pipelines for Airbnb and Booking.com listing onboarding/offboarding, review scraping, and property details extraction
  • Developed Airtable-to-database automation workflows using FastAPI
Oct 2024 – Mar 2025 Otamiser · Ghent, Belgium

Student IT (Data Engineer)

  • Contributed to the architecture and deployment of the company's first scalable data warehouse on GCP BigQuery
  • Built modular SQL data transformations using dbt Core
  • Orchestrated and automated end-to-end data workflows using Dagster
  • Researched and implemented new data engineering technologies to optimize pipelines
2 Months (2024) Otamiser · Ghent, Belgium

IT Intern

  • Established an MDM framework with standardized policies for staff device management
  • Designed an automated, centralized security system integrating Twilio, Gmail, and Slack for monitoring and logging 2FA codes
  • Conducted comprehensive Google Workspace reviews to enhance security and efficiency
  • Optimized email security by configuring DMARC to mitigate spam and phishing threats
  • Contributed to a scalable Data Warehousing solution to streamline analytics accessibility

Skills & Technologies

Tools and technologies I use to build data-driven and intelligent systems.

Python 90%

General-purpose powerhouse for ML, data science, scripting, and automation across all my projects.

SQL 95%

Advanced query writing, performance tuning, and warehouse modeling; the language I use most day to day.

dbt Core 90%

Modular SQL transformations with version-controlled models, tests, and documentation in BigQuery.

Dagster 80%

Asset-based pipeline orchestration: scheduling, monitoring, and managing reliable data workflows.

Google Cloud Platform 85%

BigQuery, Cloud Storage, and the wider GCP ecosystem for scalable data engineering pipelines.

Pandas / NumPy 90%

Core data manipulation and numerical computing for analysis, preprocessing, and feature engineering.

PyTorch / Deep Learning 82%

Building CNNs, GANs, ViTs, and RL agents, from research prototypes to trained models.

Computer Vision (OpenCV/YOLO) 85%

Object detection, facial recognition, 3D reconstruction, and real-time vision systems.

Apache Spark / Databricks 65%

Distributed processing for large-scale datasets, used for COVID-19 predictive analytics.

Power BI / Looker Studio / BI Reporting 80%

Translating raw data into business dashboards and actionable insights for stakeholders.

PostgreSQL / MySQL 75%

Relational database design, query optimization, and backups across academic and enterprise projects.

NLP 75%

Sentiment analysis, text classification, and description quality evaluation using modern NLP techniques.

Portfolio

My Projects

AI Quiz

Computer Vision

Quiz-Based Question Answering System using LLMs

A smart quiz system using Large Language Models that verifies user responses and provides detailed feedback.

Smart House

IoT

Smart House Control App with Arduino

End-to-end IoT system controlling lights, fans, and sensors via Arduino and a Bluetooth mobile interface.

Traffic Sign Detection

Computer Vision

Traffic Sign Detection with Faster R-CNN

Implemented traffic sign detection using Faster R-CNN to improve automated road safety systems.

Facial Detection

Computer Vision

Facial Detection using MTCNN & SVM

Combined MTCNN face detection with an SVM classifier for robust facial recognition.

AR Avatar Fighting

Computer Vision

Augmented Reality Avatar Fighting

Developed an interactive augmented reality avatar fighting system with Unity.

RL Agent ViT

Machine Learning

Reinforcement Learning Agent with Vision Transformer (ViT)

Integrated Vision Transformer with DQN to build an RL agent that classifies images via learned visual features.

YOLO Pedestrian

Computer Vision

Pedestrian Detection with YOLO & Explainable AI

Real-time pedestrian detection with YOLO, augmented by XAI methods to interpret model decisions.

VAE

Computer Vision

Image Compression & Generation with Variational AutoEncoder

VAE for high-fidelity image compression, reconstruction, and novel image generation from a learned latent space.

PointNet

Computer Vision

3D Point Cloud Classification with PointNet

Applied PointNet to classify unordered 3D point clouds directly for accurate scene understanding.

3D Reconstruction

Computer Vision

3D Reconstruction with Open3D & MVS

Used multi-view stereo and Open3D to reconstruct accurate 3D scenes from multiple input images.

MATLAB

Machine Learning

Image Processing with MATLAB

Applied MATLAB for image filtering, enhancement, and edge detection in a structured processing pipeline.

COVID

Data Science

COVID-19 Death Prediction using PySpark

Processed large-scale COVID datasets with PySpark to predict fatality rates across countries.

Twitter Sentiment

Data Science

Twitter Sentiment Analysis

Classified tweet sentiment as positive, negative, or neutral using NLP and machine learning techniques.

Preprocessing

Data Science

Automated Data Preprocessing Pipeline

Built an automated pipeline for cleaning, normalizing, and encoding datasets for ML applications.

Class Reservation

Web Development

Class Reservation Website

Designed a reservation platform for scheduling and managing class registrations efficiently.

Download My Resume

Available in English and French. Last updated 2026.

Say Hello

Get in Touch

Let's work together

I'm open to data engineering, ML, or research roles. Whether you have a project in mind, a question about my work, or just want to connect, my inbox is always open.
