Pablo X Zumba

About me

Hello! I'm a versatile and driven professional who recently graduated with a Master's degree in Business Analytics and Information Systems.
My background includes a Bachelor's degree in Electronic Engineering and an Associate's degree as a Biomedical Technician, showcasing my multidisciplinary approach to problem-solving and innovation.
Passionate about data, I am particularly intrigued by data engineering, analytics, and machine learning. I'm eager to apply my skills to uncover valuable insights and drive impactful change across various industries.


My thirst for knowledge doesn't end there: I am an avid reader with a strong interest in nonfiction covering topics such as science, history, artificial intelligence, and Stoic philosophy. A firm believer in the importance of work-life balance, I love exploring the great outdoors through activities like hiking, biking, and swimming. I also enjoy the extreme sport of "emotional self-control," which helps me cultivate resilience and adaptability in both my personal and professional life.


I am excited to connect with like-minded professionals and collaborate on projects that leverage my unique skill set while expanding my horizons in the ever-evolving world of data and technology. Let's get in touch and create something extraordinary together!

Download Resume


Technical Skills

Programming Languages

Spark, Python, R, SQL (SQLite, Microsoft SQL Server, MySQL), NoSQL (Cassandra, Hive, Pig, Impala), HTML, CSS, JavaScript

Frameworks and APIs

Databricks, Snowflake, Flask, Google BigQuery, Chart.js, Microsoft Machine Learning Studio, Cloudera

Software Tools

Tableau, Power BI, Wireshark, Jupyter, MS Office, Git & GitHub, draw.io, VS Code, DB Browser, Canva

Statistical Proficiencies

Descriptive and Inferential Statistics in R and Python; Excel (VLOOKUP and Pivot Tables)

Python Libraries

TensorFlow, Scikit-learn, Keras, Pandas, Matplotlib, NumPy, SciPy, Seaborn, BeautifulSoup

Cybersecurity

Risk management, security best practices, security law and ethics, and secure software development and testing

Core Competencies

Designing advanced system processes using UML and OOP, identifying IT-related product lifecycle challenges, conducting in-depth case analysis and communication, managing technology-focused businesses, and presenting data analytics projects

Portfolio

Project 1

Attendance Prediction ML Models

Developed machine learning models in Databricks (Apache Spark) to predict attendance for the 2016 baseball season and improve profitability. The most accurate model reached about 83% accuracy.

View Project
Project 2

Inmate Reintegration Program Database Design

Database design, development, and implementation for an inmate reintegration program using draw.io for modeling and Microsoft SQL Server & Azure for implementation.

View on GitHub
Project 3

Intelligent Tourist Website

Created a website using HTML, CSS, JavaScript, Chart.js, and Python Flask that suggests tours and excursions based on solo travelers' preferences.

Visit Website

Watch the video walkthrough
Project 4

Divorce in the US

Created a thorough report on the possible causes of divorce in the United States using SQLite, RStudio, and Tableau. The purpose was to help a psychologist concentrate on significant factors during marital therapy.

View Project Report
Project 5

E-commerce Database Design

A database design, development, and implementation project for an e-commerce company, using draw.io for ERD modeling and Microsoft SQL Server & Azure for implementation.

View Project Report
Project 6

Life Expectancy and GDP

The purpose of this project is to investigate the relationship between life expectancy and Gross Domestic Product (GDP) in six countries using Python and JupyterLab.

View Project on GitHub
Project 7

Patient Demographic and Health Analysis

In this project, we studied a patient dataset to extract meaningful information about population distribution and health-related costs, taking lifestyle and family attributes into account.

View Project on GitHub
Project 8

Biodiversity in National Parks

This project performs an analysis of data on the conservation status of endangered species in different national parks and investigates whether there are patterns or issues related to the types of species that are endangered.

View Project on GitHub
Project 9

Web-Scraping with Beautiful Soup

This project shows how to scrape data from a website using the pyplot, NumPy, pandas, requests, and BeautifulSoup libraries. The website is hosted on AWS and has over 1,700 reviews of chocolate bars from all around the world.

View Project on GitHub
Project 10

Design of Medical Big Data Systems

This is a research paper about the non-functional requirements for the design of Medical Big Data Systems.

View Project
Project 11

R code to determine whether a variable is normally distributed

The goal of this project is to demonstrate how to use R to determine whether a variable is normally distributed by using histograms, boxplots, Q-Q plots, and Q-Q lines.

View Project on GitHub
Project 12

Quantitative Analysis of Credit Unions: An Application of Statistical Methods in R

Investigating Member Sizes and Total Assets in Florida, California, New York, and New Jersey Credit Unions using Confidence Intervals and Hypothesis Testing

View Project on GitHub
Project 13

Exploratory Data Analysis and Regression Modeling of U.S. Domestic Airfares

Analyzing Airfare and Passenger Data for Selected U.S. Domestic Routes with R: Data Preprocessing, Descriptive Statistics, and Regression Analysis

View Project on GitHub
Project 14

Time Series Analysis and Forecasting of U.S. Domestic Flight Passengers

Modeling Seasonality in Passenger Volume: Applying Regression and the Durbin-Watson Test to De-seasonalize and Re-seasonalize U.S. Flight Data

View Project on GitHub
Project 15

Comparative Analysis of Vehicle Listings on Craigslist: Applying Stratified Sampling and ANOVA in R

An analysis of how asking prices and odometer readings vary by region and cylinder count across five U.S. regions

View Project on GitHub
Project 16

Predictive Modeling of Home Sales in Hunters Green: Analyzing Days on Market and Sale Prices

Exploring the Key Predictors of Property Sale Duration and Price through Regression Models and Hypothesis Testing in R

View Project on GitHub
Project 17

Predictive Analysis of Customer Churn in Telecom Services: A Multimodel Approach in R

Examining Predictors of Churn Among Telephone, Internet, and Dual-Service Subscribers: Logistic Regression Models, Performance Metrics, and Classifier Tuning

View Project on GitHub
Project 18

Survival Analysis in Medical Trials: Comparing Patient Outcomes on Standard vs. Test Treatments

Utilizing Kaplan-Meier Estimates and Parametric Models to Assess Survival Probabilities and Influence of Age and Diagnosis Months on Treatment Efficacy

View Project on GitHub
Project 19

Analyzing Retail Sales and Promotions: A Comprehensive Study on Pricing, Product Categories, and Store Segments

Unraveling the Impact of Promotions and Pricing on Product Sales and Identifying the Most Elastic Products in a Large Retail Chain

View Project on GitHub
Project 20

The Dynamics of MLB Game Attendance: An In-depth Analysis Using Machine Learning Approaches

Evaluating and Comparing Predictive Models for Attendance at Major League Baseball Games

View Project on GitHub
Project 21

Predicting Salaries from Job Postings: A Machine Learning Approach

Evaluating Predictive Models for Job Salaries Using Text-based and Numeric Features

View Project on GitHub
Project 22

Predicting Heartbeat Anomalies: A Comparative Study of Neural Networks

Evaluating Multiclass Classification Models for Heartbeat Measurements

View Project on GitHub
Project 23

A Template for Exploratory Data Analysis and Preparation

Understanding and Preprocessing Data for Advanced Analytics

View Project on GitHub
Project 24

House Price Prediction in King County, WA using Regression Techniques

Building and Comparing Various Regression Models to Predict House Sale Prices

View Project on GitHub
Project 25

Loan Default Prediction in the Banking Industry using SVM Models

Preventing Bad Loans by Predicting Loan Default Using Various SVM Techniques and Hyperparameter Optimization

View Project on GitHub
Project 26

Predicting Hospital Readmission in Diabetic Patients Using Decision Tree Models

Improving Healthcare Performance Metrics through Enhanced Predictive Modeling and Hyperparameter Optimization

View Project on GitHub
Project 27

Predicting Underage Drinking in High School Students Using Ensemble Machine Learning Methods

Harnessing the Power of Ensemble Learning to Detect and Intervene in Underage Drinking Cases

View Project on GitHub
Project 28

Predicting Patients' Smoker Status Through Text Mining and Machine Learning with scikit-learn

Leveraging Latent Semantic Analysis and Stochastic Gradient Descent Classifier for Healthcare Predictive Analytics

View Project on GitHub
Project 29

Preventing Bad Loans with Machine Learning: An Exercise in Binary Classification and Feature Engineering in the Banking Industry

Leveraging Neural Networks to Predict Loan Outcomes

View Project on GitHub
Project 30

Predicting Loan Default Risk in the Banking Industry: An Exercise in Binary Classification, Feature Engineering and Neural Networks using Keras

Designing Shallow and Deep Neural Networks for Risk Assessment in Home Loans

View Project on GitHub
Project 31

Convolutional Neural Network for Lego Brick Classification

Image Recognition Model for Differentiating Types of Lego Bricks

View Project on GitHub
Project 32

Power Grid Stress Prediction using Recurrent Neural Networks

Implementing and Comparing Different RNN Architectures for Time Series Data

View Project on GitHub

Contact