Available for new opportunities

Hi, I'm M Faisal Chughtai
Full Stack · Data Engineer · ML

I've spent the last 6+ years building things for the web: from real-time trading platforms and SaaS booking systems to data pipelines and ML models. I genuinely enjoy the full journey, from a clean React component to a Kafka stream or a SHAP plot. Currently doing my M.S. in Data Science at Aston University while keeping one foot firmly in industry.

Technical Arsenal

A multi-disciplinary engineer spanning the full spectrum: from pixel-perfect interfaces to distributed data pipelines and ML-driven insights.

Frontend Engineering

React.js · Next.js · TypeScript · JavaScript · Tailwind CSS · Redux Toolkit · Zustand · Framer Motion · React Native · Vite · HTML5 / CSS3

Backend Engineering

Node.js · Express.js · REST APIs · GraphQL · WebSockets · Socket.io · JWT Auth · Microservices · Docker · CI/CD / Jenkins

Databases & Storage

MongoDB · MySQL · PostgreSQL · Firebase · Redis · Elasticsearch · MinIO (S3) · Mongoose · NoSQL Design · SQL Optimization

Data Engineering

Kafka · Monstache · Kibana · ETL Pipelines · Elasticsearch Indexing · Data Warehousing · MongoDB Aggregation · Apache Spark (Learning) · Real-time Streaming

Data Science & Analytics

Python · Pandas · NumPy · Scikit-learn · Machine Learning · Statistical Modeling · Data Visualization · Jupyter Notebooks · TensorFlow (Learning) · Power BI / Recharts

DevOps & Cloud

Docker · Rancher · Jenkins · DigitalOcean · AWS · Azure · Vercel · Nginx · Git / GitHub · Linux · Environment Config

Professional Journey

M.S. in Data Science

2025 - Present
Aston University, Birmingham

Pursuing an advanced Master of Science degree in Data Science, focusing on scalable data architectures, machine learning methodologies, and advanced analytical modeling.

Postgraduate Mentor

2025 - 2026
Aston University, Birmingham

Selected for the Postgraduate Mentoring Scheme, supporting fellow students through academic guidance and fostering a collaborative, supportive community. Certified by Professor Osama Khan, Deputy Vice-Chancellor.

Travel Consultant

Sep 2025 - Present
Terrific Travels LTD

Part-time on-site role in Birmingham, consulting on travel bookings and providing customer-facing travel advisory services alongside postgraduate studies.

IELTS: CEFR Level C1

Certified
British Council (Academic English)

Achieved CEFR Level C1 (Effective Operational Proficiency) in the International English Language Testing System, demonstrating advanced academic and professional English communication skills.

Lead Frontend Developer / Development Manager

June 2022 - September 2024
QMH Technologies PVT LTD (UK Based)

Built trading, fintech, and booking platforms using React, Vue.js, Node.js, Elasticsearch, SQL, and Docker. Developed an e-commerce platform with React, Next.js, and Magento REST APIs. Improved address search accuracy by syncing MongoDB data into Elasticsearch with Monstache and tuning queries in Kibana.

Full Stack Developer

February 2021 - June 2022
BEPAKISTANI.PK

Built user interfaces for BEPakistani.pk, RecipesAlert.com, PoetryAlert.com, and 12+ mobile apps using Next.js, React.js, Firebase, Redux, and React Native. Improved site performance to meet Core Web Vitals and related performance targets (LCP, FCP, FID, CLS).

Unity Game Developer

December 2018 - November 2020
Robaitec (SMC Private) Limited

Designed core game systems and cross-platform compatibility across Windows, macOS, iOS, and Android. Wrote clean C# code and optimized assets.

Featured Work

Microservices Enterprise

SaaS Taxi & Courier Management

A large multi-tenant SaaS ecosystem featuring 3 independent React frontends and 4 decoupled backends. Delivers intelligent dispatching, global payment integrations, and centralized Elasticsearch analytics.

TypeScript · Node.js · MongoDB · Elasticsearch · MinIO · Docker · React 18 · Zustand · Stripe
Desktop CRM Application

WhatsApp Agent Desktop App

An enterprise desktop application enabling remote customer support teams to collaboratively manage WhatsApp Business communications in real-time. Built as a multi-agent Electron app utilizing live Socket.io messaging, FFmpeg media processing, and Webhook integrations via Meta APIs.

Electron · React 19 · Socket.io · Node.js · Express · MongoDB · Zustand · FFmpeg · Vite
Data Engineering · Cloud Architecture

Azure Real-Time Sales Analytics

Enterprise-grade end-to-end real-time analytics pipeline for a global e-commerce retailer. Streams 600K+ records through Azure Event Hubs → Stream Analytics → SQL Database, visualized live in Power BI via DirectQuery, removing the wait for batch reports.

Azure Event Hubs · Azure Stream Analytics · Azure SQL DB · Azure VM · Power BI · Python · Pandas · T-SQL
Data Engineering · Serverless Architecture

Azure Automated Weather Pipeline

Production-grade serverless pipeline for a global logistics company, automatically ingesting daily weather & AQI data from 24 global cities via OpenWeatherMap APIs. Orchestrated by Azure Data Factory: Azure Function → Blob Storage → SQL Database → Power BI dashboard with risk KPIs, conditional formatting, and geographic mapping.

Azure Functions · Azure Data Factory · Azure Blob Storage · Azure SQL DB · Power BI · Python 3.11 · Pandas · OpenWeatherMap API
ML Engineering · Big Data · MLOps

Real-Time Fraud Detection Pipeline

End-to-end production fraud detection platform. Kafka streams credit card transactions into a PySpark cluster for real-time feature engineering, persists to PostgreSQL, transforms via dbt, and serves sub-100ms predictions through a FastAPI inference API, all orchestrated by Apache Airflow in a 15-container Docker environment.

Apache Kafka · PySpark · Apache Airflow · dbt · FastAPI · scikit-learn · PostgreSQL · MinIO · Docker
SaaS Taxi Dispatching Platform

Caberly

A cloud-based web application enabling real-time ride booking, intelligent driver allocation, and live trip tracking. Features automated dispatching, GPS tracking, fare estimation, and dedicated fleet management dashboards.

React · TypeScript · Redux-Saga · Node.js · MongoDB · SQL · Elasticsearch · Kafka · Redis · Docker · Microservices · CI/CD · Jenkins · Rancher
Digital Color Platform

ColorPouch

A comprehensive palette generator providing accessibility-tested color schemes and creative tools like a Color Wheel, Gradient Maker, and Image-to-Palette extractor for designers and developers.

Next.js · React · TypeScript · Tailwind CSS · Framer Motion · MySQL · Culori · Color Thief
Global Trading Platform

4XHUB

A high-speed global CFD trading platform for automated and manual execution. Features ultra-low latency, real-time market updates, and a strict type-safe frontend connected to a robust PHP backend.

React · TypeScript · Redux Toolkit · Tailwind CSS · PHP · MySQL · MetaTrader Integration
Booking System

QSS/QSG

A specialized website launcher and comprehensive booking management system.

HTML · CSS · JavaScript · Core PHP
E-Commerce Platform

Tamadres

A specialized Turkish book e-commerce website featuring a scalable catalog and seamless purchasing experience.

Digital News Platform

BEPAKISTANI.PK

An independent digital news publishing platform providing extensive coverage across technology, business, automobiles, and more. Features SEO optimization, a custom Rich-Text editor, and internationalization.

React · Redux · Firebase · Express · Bootstrap · Draft.js · React Helmet

Other Notable Work

Aston University · M.S. Data Science

Academic Research

Postgraduate coursework applying advanced statistical, machine learning, and data engineering methodologies to real-world datasets.

View Official Transcript (Gradintel)

Final Research Project (AM41PRA)

Aston University, M.S. Data Science · Dissertation (Work in Progress)

MSc Dissertation
1

Valuation of Collectable Cards: Intrinsic and Extrinsic Factors

Developing a machine learning framework to predict valuations for TCGs like Pokémon and Magic: The Gathering. Modelling intrinsic factors (age, condition) and extrinsic drivers (playability, influencer trends, grading populations) using regression and time-series analysis.

Machine Learning · Regression · Time-Series · Data Scraping · Python · Predictive Modelling
2

Market Analysis and Price Prediction

Scraping historical price data to assess volatility and reliability of future set valuations. Investigating the impact of third-party grading (PSA/BGS) as a measure of demand and market efficiency in alternative investment spaces.

Scrapy · Data Analysis · Market Efficiency · Financial Modelling

Specialist Research Skills & Techniques (AM41RSA)

Aston University, M.S. Data Science · Module Coursework

Coursework Proposal · GitHub Repo
1

Research Design: GB System Inertia Forecasting

Proposed forecasting GB power system inertia using a Bi-LSTM model benchmarked against ARIMA and XGBoost, trained on NESO 30-minute settlement data across wind, solar, thermal, and interconnector sources. Addresses the growing stability risk as renewable generation displaces synchronous machines.

Bi-LSTM · ARIMA · XGBoost · TensorFlow · Time-Series · LaTeX · NESO Data
2

Explainability Framework: SHAP Analysis

Designed a SHAP-based explainability layer to break down individual inertia predictions into per-feature contributions. Includes beeswarm plots, force plots, and time-of-day SHAP patterns to make model behaviour transparent for grid operators.

SHAP · XAI · Explainability · Power Systems · Frequency Stability · Python · Seaborn

Artificial Neural Networks (AM41ANA)

Aston University, M.S. Data Science · Module Coursework

Deep Learning
1

U1-U2 · Fundamentals & Deep Feed-Forward Networks

U1 built up from biological neurons to perceptrons, activation functions, and the universal approximation theorem. U2 went deeper into FFNs: vanishing gradients, Xavier/He initialisation, batch normalisation, dropout, and modern optimisers (Adam, SGD with momentum).

Perceptrons · Backpropagation · Vanishing Gradient · Batch Norm · Dropout · Adam · PyTorch
2

U3-U5 · CNNs, Sequence Models & Advanced Topics

U3 covered CNNs, pooling, and key architectures (LeNet, VGG, ResNet) with transfer learning. U4 introduced RNNs, LSTMs, GRUs, and seq2seq. U5 tackled the Transformer: self-attention, positional encoding, BERT pre-training, GPT generation, and LLM fine-tuning trade-offs.

CNNs · ResNet · LSTMs · GRUs · Transformers · Self-Attention · BERT · GPT · LLMs

Data Mining (CS4850 / AM41UDA)

Aston University, M.S. Data Science · Module Coursework

Applied ML · GitHub Repo
1

Pipeline: EDA, Preprocessing and Grouped CV

Built a full data mining pipeline for epitope prediction on Trypanosoma cruzi peptide sequences (binary classification). Performed EDA on class-imbalanced data, applied feature scaling and SMOTE, then used GroupKFold cross-validation with the Info_group column to prevent data leakage across protein families.

Pandas · SMOTE · GroupKFold · Scikit-learn · EDA · Class Imbalance · Data Leakage Prevention
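
The grouped cross-validation idea above can be illustrated with a small sketch. This is a dependency-free stand-in for scikit-learn's GroupKFold, not the coursework code: the function name `grouped_folds` and the greedy balancing rule are assumptions made for illustration. The point it shows is that every sample sharing a group label (for example, a value of Info_group) lands on the same side of each split.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def grouped_folds(
    groups: Sequence[Hashable], n_splits: int
) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) index lists where no group spans both sides."""
    members: Dict[Hashable, List[int]] = defaultdict(list)
    for index, group in enumerate(groups):
        members[group].append(index)
    if n_splits < 2 or n_splits > len(members):
        raise ValueError("n_splits must be between 2 and the number of groups")

    # Greedy balancing (illustrative): largest groups go first, each into
    # whichever fold currently holds the fewest samples.
    fold_indices: List[List[int]] = [[] for _ in range(n_splits)]
    for group in sorted(members, key=lambda g: len(members[g]), reverse=True):
        smallest = min(range(n_splits), key=lambda f: len(fold_indices[f]))
        fold_indices[smallest].extend(members[group])

    splits = []
    for test in fold_indices:
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        splits.append((train, sorted(test)))
    return splits
```

When oversampling such as SMOTE is part of the pipeline, it is usually fitted on the training indices of each split only, so synthetic samples never leak into the held-out groups.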
2

Model Comparison and Holdout Predictions

Trained and compared Random Forest, SVM, Logistic Regression, and Gradient Boosting classifiers using balanced accuracy as the primary metric. Selected the best model, generated predictions on an unseen holdout set, and submitted results as a reproducible notebook, PDF report, and predictions CSV.

Random Forest · SVM · Gradient Boosting · Balanced Accuracy · Reproducibility · Python
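
For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, which is why it suits class-imbalanced data. The sketch below is a plain-Python illustration under that definition; the function name and the toy labels are hypothetical and are not taken from the coursework.

```python
from typing import Hashable, Sequence


def balanced_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Average the recall of each class that appears in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    recalls = []
    for cls in set(y_true):
        positions = [i for i, label in enumerate(y_true) if label == cls]
        correct = sum(1 for i in positions if y_pred[i] == cls)
        recalls.append(correct / len(positions))
    return sum(recalls) / len(recalls)


# Toy example: a classifier that always predicts the majority class.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

The majority-class predictor looks strong on plain accuracy but scores no better than chance on balanced accuracy.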

Data Science Programming (AM41DP)

Aston University, M.S. Data Science · Module Coursework

Programming · GitHub Repo
1

Task 1: Battleships Game Solver

Implemented a Battleships/Yubotu solver in Python on a 10x10 grid. Built ship placement validation, hit/miss tracking, win condition logic, and an AI guessing strategy using checkerboard search patterns combined with targeted hunt mode after a confirmed hit.

Python · Algorithms · OOP · Game Logic · AI Strategy · Jupyter
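
A minimal sketch of the search-then-hunt strategy described above, assuming a 10x10 grid and ships at least two cells long. The function names `checkerboard_cells` and `hunt_targets` are hypothetical; the actual solver may be organised differently.

```python
from typing import Iterator, List, Set, Tuple

Cell = Tuple[int, int]
SIZE = 10  # 10x10 grid, as in the task description


def checkerboard_cells(size: int = SIZE) -> Iterator[Cell]:
    """Yield every cell where (row + col) is even.

    Any ship of length >= 2 must cover at least one such cell,
    so search mode only needs to probe half the board.
    """
    for row in range(size):
        for col in range(size):
            if (row + col) % 2 == 0:
                yield (row, col)


def hunt_targets(hit: Cell, tried: Set[Cell], size: int = SIZE) -> List[Cell]:
    """Return the untried orthogonal neighbours of a confirmed hit."""
    row, col = hit
    neighbours = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [
        (r, c)
        for r, c in neighbours
        if 0 <= r < size and 0 <= c < size and (r, c) not in tried
    ]
```

A guessing loop would draw from `checkerboard_cells()` until a shot lands, then switch to the cells from `hunt_targets()` until the ship is sunk.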
2

Task 2: Video Game Sales Analysis and SQL

Full data science pipeline on the VGSales dataset: cleaning, EDA, feature engineering (decade, regional share), and statistical testing (t-test, ANOVA). Built and compared Linear Regression and Random Forest models for global sales prediction. Submitted SQL queries separately covering aggregation, filtering, and window functions.

Pandas · Scikit-learn · Seaborn · SciPy · SQL · Feature Engineering · Regression

Statistical Machine Learning (CS4730A / AM41MLA)

Aston University, M.S. Data Science · Module Coursework

Machine Learning
1

CW1 (20%): Regression Analysis and MLP Learning Rate

Investigated a 5-feature dataset to determine whether the task was classification or regression, and assessed linear vs. non-linear model fit with evidence. Wrote a technical email to a colleague explaining the learning rate in an MLP context, its effect on convergence and generalisation, and practical guidance on selecting it without prior knowledge of the dataset.

Regression · MLP · Learning Rate · Scikit-learn · Python · Model Selection
2

CW2 (80%): Clustering, Sparse PCA and Osteoporosis Classification

Applied k-means and GMM with principled cluster selection to a 450-point dataset. Used Sparse PCA on New Delhi air quality data before and after standardisation, comparing component sparsity. Designed and compared 3+ classifiers for osteoporosis risk prediction (5,000 patients), integrating medical literature on Vitamin D, Calcium, and alcohol thresholds into preprocessing decisions.

K-Means · GMM · Sparse PCA · StandardScaler · Classification · Imbalanced Data · Medical ML

Network Science (AM41NS)

Aston University, M.S. Data Science · Module Coursework

Advanced · GitHub Repo
1

Part A: Email Network Analysis

Community detection on a real European email network using the Girvan-Newman algorithm. Computed modularity scores, degree distributions, and clustering coefficients to identify organisational clusters.

NetworkX · Girvan-Newman · Modularity · Graph Theory · Matplotlib
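
Girvan-Newman produces a sequence of candidate partitions, and modularity is a common score for choosing among them. The sketch below computes modularity for an undirected, unweighted graph in plain Python. The function name and the toy graph are assumptions for illustration; the coursework itself lists NetworkX.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def modularity(
    edges: Iterable[Tuple[Hashable, Hashable]],
    community: Dict[Hashable, Hashable],
) -> float:
    """Q = sum over communities of (L_c / m) - (d_c / (2m))^2.

    L_c is the number of edges inside community c, d_c the total degree
    of its nodes, and m the number of edges in the graph.
    """
    edge_list = list(edges)
    m = len(edge_list)
    if m == 0:
        raise ValueError("graph must contain at least one edge")
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edge_list:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(edges, split), 4))  # 0.3571
```

Placing every node in one community gives a modularity of zero, so a positive score indicates more internal edges than a degree-matched random graph would have.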
2

Part B: Marvel Universe Epidemic Modelling

Epidemic modelling on a 6,486-node Marvel character network. Computed the critical spreading threshold (λc ≈ 0.78%) via Molloy-Reed, and ran stochastic SIR simulations to analyse phase transitions from local outbreaks to full epidemics.

SIR Model · Scale-Free Networks · Molloy-Reed · NumPy · SciPy
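
As a rough illustration of where such a threshold comes from: for SIR spreading on an uncorrelated random network, the standard degree-based estimate is λc = ⟨k⟩ / (⟨k²⟩ − ⟨k⟩), which follows from the Molloy-Reed criterion. The sketch below applies that formula to a degree sequence. The function name and the toy degree sequences are hypothetical; the 0.78% figure above comes from the actual Marvel network, which is not reproduced here.

```python
from typing import Sequence


def epidemic_threshold(degrees: Sequence[int]) -> float:
    """Estimate the SIR threshold <k> / (<k^2> - <k>) from node degrees."""
    if not degrees:
        raise ValueError("degree sequence must not be empty")
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k_sq = sum(d * d for d in degrees) / n
    if mean_k_sq <= mean_k:
        raise ValueError("no giant component expected: <k^2> must exceed <k>")
    return mean_k / (mean_k_sq - mean_k)


# Every node has degree 3: a homogeneous network.
print(epidemic_threshold([3] * 10))  # 0.5

# One hub among low-degree nodes pushes the threshold down.
print(round(epidemic_threshold([1] * 9 + [9]), 4))  # 0.25
```

Heavy-tailed degree distributions inflate ⟨k²⟩, which is why hub-dominated networks tend to have very low thresholds.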

Probabilistic Modelling (AM41PBA)

Aston University, M.S. Data Science · Module Coursework

Statistics & ML · GitHub Repo
1

CW1: MLE & Bayesian Classification

Analytically derived MLEs for Exponential and Rectified Gaussian distributions on 400 call-centre records, selecting the best fit to flag outliers. Extended to a Bayesian gemstone classifier, deriving the optimal decision threshold m_t analytically, including asymptotic and asymmetric cost cases.

MLE · Exponential Dist. · Gaussian Dist. · Bayesian Decision Theory · Outlier Detection · Python · SciPy
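
For the Exponential case, the analytic result is short enough to sketch. The log-likelihood is ℓ(λ) = n ln λ − λ Σx, and setting its derivative to zero gives λ̂ = n / Σx, the reciprocal of the sample mean. The code below checks that numerically on made-up durations; the function names and data are hypothetical and the Rectified Gaussian case is not covered.

```python
import math
from typing import Sequence


def exponential_mle(samples: Sequence[float]) -> float:
    """Closed-form MLE of the rate: n / sum(x)."""
    if not samples or any(x < 0 for x in samples) or sum(samples) == 0:
        raise ValueError("samples must be non-negative with a positive sum")
    return len(samples) / sum(samples)


def exponential_log_likelihood(samples: Sequence[float], rate: float) -> float:
    """Log-likelihood n*ln(rate) - rate*sum(x) of an Exponential(rate) model."""
    return len(samples) * math.log(rate) - rate * sum(samples)


# Made-up call durations in minutes, for illustration only.
durations = [1.0, 2.0, 3.0]
rate_hat = exponential_mle(durations)
print(rate_hat)  # 0.5

# The closed-form estimate beats nearby rates on log-likelihood.
best = exponential_log_likelihood(durations, rate_hat)
assert best > exponential_log_likelihood(durations, 0.4)
assert best > exponential_log_likelihood(durations, 0.6)
```

Comparing the maximised log-likelihoods of competing distributions is one way to choose the better fit before flagging outliers.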
2

CW2: Gaussian Mixture Models & EM Algorithm

Fitted GMMs (K = 3, 4, 5) to penguin culmen-depth and weight data via EM. Plotted 2D distributions and log-likelihood convergence per K, then used BIC/AIC to select the optimal number of species and discussed why results may diverge from ornithologists' expectation of 3.

GMM · EM Algorithm · BIC / AIC · Scikit-learn · Matplotlib · NumPy · Model Selection
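
The model-selection step can be sketched from the definitions AIC = 2p − 2 ln L and BIC = p ln n − 2 ln L, where p is the number of free parameters. The sketch assumes full covariance matrices, so a K-component GMM in d dimensions has (K − 1) + K·d + K·d(d + 1)/2 parameters. The function names are hypothetical, and no log-likelihood values from the coursework are reproduced.

```python
import math


def gmm_param_count(n_components: int, n_features: int) -> int:
    """Free parameters of a full-covariance Gaussian mixture."""
    weights = n_components - 1  # mixing weights sum to one
    means = n_components * n_features
    covariances = n_components * n_features * (n_features + 1) // 2
    return weights + means + covariances


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


# Parameter counts for the two-feature case with K = 3, 4, 5.
for k in (3, 4, 5):
    print(k, gmm_param_count(k, 2))  # 17, 23, 29
```

Because BIC's penalty grows with ln n, it tends to favour fewer components than AIC on larger samples, which is one reason the two criteria can disagree on K.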

Let's Work Together

Have a project in mind or looking for a full stack lead? Drop me a message.