Joseph EL HADDAD

Data & Automation Specialist

joseph.ergo@proton.me | Portfolio | Resume PDF | LinkedIn | +212 713-617-633

Morocco · On-site or remote (UTC+1)

Summary

Python data engineer specializing in large-scale web scraping and automated lead intelligence. Built pipelines processing 100,000+ records for clients in the US, UK, and UAE. Security-conscious, working end-to-end from raw extraction to structured, actionable datasets.

Experience

Data Lead · Feb 2026 – Present

Hive Gate · Dubai (Remote)

Engineered an automated pipeline to scrape and structure 100,000+ target institutions, feeding directly into a user acquisition initiative.
Enriched raw records to surface high-value contact channels (WhatsApp, scheduling links), letting the sales team prioritize the highest-converting outreach paths.
Delivered segmented, ready-to-use datasets that cut the time from prospect identification to first contact.

Researcher · Jan 2026 – Mar 2026

Happy City Index · London (Remote)

Sourced and verified data from official municipal publications, including direct contact with city authorities to resolve gaps.
Peer-reviewed contributions from other researchers to maintain dataset consistency and integrity.
Listed as a named contributor on the organization's public team page.

Data & Automation Analyst · Jun 2024 – Mar 2025

Currier Marketing · California (Remote)

Automated SEO audits across client websites using Python, detecting 404 errors, duplicate content, and keyword cannibalization at scale.
Built targeted contact lists and aggregated business data for outreach campaigns and web applications.
Presented findings and proposed data initiatives directly to stakeholders, translating technical results into prioritized action items.

Projects

Discovery of a Google Maps Geolocation Anomaly

First to identify and document a systemic error placing 100,000+ U.S. businesses at a single erroneous ocean coordinate (46.4°N, 129.9°W). Of the affected records, 73% shared the same bad point. Findings were published in GoogleMapsMania.

Automated Prospect Intelligence Pipeline

Python-driven system that aggregates and cross-references data from Google Maps and corporate websites, then filters for high-value contact channels (WhatsApp, scheduling links, direct emails) to segment thousands of prospects by conversion likelihood.

Large-Scale Talent Intelligence Analysis

Local pipeline built with DuckDB to process and structure 1.4 TB of unstructured data. Outputs custom reports on company hiring trends, talent networks, and individual career trajectories.

Skills

Data engineering: Python, Pandas, Polars, DuckDB, NumPy, SQL
Extraction & automation: Selenium, Playwright, BeautifulSoup, Requests, OpenCV
Environment: Debian GNU/Linux, Git, GNU Emacs, Bash
Other: Microsoft Excel, Google Sheets, LLM APIs, R
Practices: Exploratory data analysis, data privacy, security-conscious development