Dipendra P. Yadav

Senior Data Engineer

Building enterprise-scale data pipelines at Liberty Mutual and shipping live SaaS products as Founder of Queuella LLC.

Georgetown, TX

Senior Data Engineer with 6+ years delivering enterprise-scale pipelines, ETL/ELT workflows, and cloud analytics platforms in regulated financial services. Proven in PySpark, Hadoop, Kafka, and AWS — from raw ingestion to ML-ready feature stores. Founder of Queuella LLC, a live multi-tenant SaaS serving customers across the US, Nepal, and India.

Georgetown, TX · dipen.p.yadav@gmail.com · 737-298-5520

Core Competencies

Enterprise Data Pipeline Design
PySpark / Hadoop / Hive / MapReduce
ETL & ELT Workflow Architecture
AWS Cloud Architecture (9+ services)
Real-Time & Event-Driven Systems
Data Modeling & Warehousing
AI / LLM Integration (Bedrock, OpenAI)
Data Governance & Quality Frameworks
Square API & Third-Party Integration
Full-Stack SaaS Development

Professional Experience

Senior Data Engineer

Liberty Mutual Insurance Company

Jan 2021 – Present

Remote | Austin, TX

  • Architected end-to-end enterprise data pipelines using PySpark, Hadoop, and Hive on AWS EMR, processing multi-TB insurance datasets to power analytics, reporting, and ML workloads.
  • Spearheaded migration of core insurance data systems to AWS (S3, EMR, Glue, Redshift), designing ETL/ELT workflows that reduced data processing errors and testing cycles by 85%.
  • Implemented data quality and governance frameworks — schema validation, null-rate monitoring, row-count reconciliation, and data lineage tracking — ensuring audit compliance in regulated environments.
  • Designed star-schema and dimensional data models supporting Tableau and Power BI dashboards across underwriting, claims, and finance functions.
  • Partnered with data scientists to build ML-ready feature pipelines ensuring point-in-time correctness and feature consistency between training and inference environments.
  • Led and mentored a cross-functional Agile team, driving monthly insurance reports and data extracts with structured stakeholder reporting.
  • Optimized Spark job performance via partition tuning, broadcast joins, data skew mitigation, and strategic caching — reducing runtimes and cluster costs for mission-critical batch workloads.
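
As an illustration of two of the data quality checks named above (null-rate monitoring and row-count reconciliation), here is a minimal sketch in plain Python. It is not the production framework: the function names, the sample column names, and the tolerance parameter are all assumptions made for this example.

```python
from typing import Dict, Iterable, List, Mapping


def null_rates(rows: Iterable[Mapping[str, object]], columns: List[str]) -> Dict[str, float]:
    """Return, per column, the fraction of rows where the value is missing or None."""
    materialized = list(rows)
    if not materialized:
        return {col: 0.0 for col in columns}
    total = len(materialized)
    return {
        col: sum(1 for row in materialized if row.get(col) is None) / total
        for col in columns
    }


def reconcile_row_counts(source_count: int, target_count: int, tolerance: float = 0.0) -> bool:
    """True when the target count differs from the source by at most `tolerance` (a fraction)."""
    if source_count == 0:
        return target_count == 0
    return abs(source_count - target_count) / source_count <= tolerance


def columns_over_threshold(rates: Mapping[str, float], max_null_rate: float) -> List[str]:
    """List the columns whose null rate exceeds the allowed maximum, sorted by name."""
    return sorted(col for col, rate in rates.items() if rate > max_null_rate)


# Hypothetical sample batch: "policy_id" and "premium" are illustrative column names.
sample_rows = [
    {"policy_id": 1, "premium": 100.0},
    {"policy_id": 2, "premium": None},
    {"policy_id": 3},
    {"policy_id": 4, "premium": 50.0},
]
sample_rates = null_rates(sample_rows, ["policy_id", "premium"])
```

In a pipeline, checks like these would typically run after each load step, with a failed reconciliation or an over-threshold column blocking downstream publication.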

Queuella — Service Operations SaaS

2024 – Present

Georgetown, TX

  • Architected and directed development of a production multi-tenant SaaS platform (Next.js 16 / TypeScript / PostgreSQL) with a native Swift iOS companion app serving paying customers across the US, Nepal, and India.
  • Designed a multi-service AWS backend: S3, SQS, SNS/SES, Cognito (including zero-data-loss migration from Clerk), ECS, Bedrock, and EventBridge.
  • Integrated Square API across five surfaces (Catalog, Team Members, Labor, Locations, Payments) enabling automatic real-time sync of a client's full business profile on integration.
  • Built a native iOS app (Swift, MVVM) with SSE streaming, APNs push notifications, Keychain auth, and deep link routing — published and serving live customers.
  • Delivered full Stripe billing infrastructure: subscriptions, webhooks, usage-based billing, and multi-location pricing with a custom superadmin pricing editor.
  • Established a production-grade testing strategy: Vitest unit suites across 8 domains and 20+ Playwright E2E suites covering auth, kiosk privacy, scheduling, and billing.

Engineering Lead — Business Automation

iBrows Studio | Multi-Location Beauty Brand

2024 – Present

Georgetown, TX

  • Built a fully automated multi-location payroll processing system (Python, AWS Lambda, EventBridge cron) eliminating all manual payroll processing across bi-monthly pay cycles.
  • Engineered a Square API pipeline consuming Labor API (timecards + break deduction), Team Members API, and Payments API — transforming raw data into fully computed payroll using pandas with per-employee rules.
  • Built a timezone-aware timecard anomaly detection engine flagging clock-in/out violations, oversized shifts, and probable auto-clockouts with configurable grace windows.
  • Delivered a three-tier email notification system: branded HTML pay stubs, timecard exception reports, and a full manager summary with payroll totals and outstanding issues.
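
The timecard anomaly rules described above can be sketched roughly as follows. This is a simplified illustration, not the deployed engine: the `Timecard` class, the flag names, the 12-hour shift cap, and the 23:59 auto-clockout marker are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone, tzinfo
from typing import List, Optional


@dataclass
class Timecard:
    employee: str
    clock_in: datetime             # must be timezone-aware
    clock_out: Optional[datetime]  # None when the shift was never closed


def flag_timecard(
    card: Timecard,
    local_tz: tzinfo,
    scheduled_start: time,
    grace: timedelta = timedelta(minutes=10),
    max_shift: timedelta = timedelta(hours=12),
    auto_clockout_at: time = time(23, 59),
) -> List[str]:
    """Return anomaly flags for one timecard, evaluated in the location's local time."""
    flags: List[str] = []
    local_in = card.clock_in.astimezone(local_tz)
    start = datetime.combine(local_in.date(), scheduled_start, tzinfo=local_tz)

    # Clock-ins outside the grace window around the scheduled start are flagged.
    if local_in < start - grace:
        flags.append("early_clock_in")
    elif local_in > start + grace:
        flags.append("late_clock_in")

    if card.clock_out is None:
        flags.append("missing_clock_out")
        return flags

    local_out = card.clock_out.astimezone(local_tz)
    if local_out - local_in > max_shift:
        flags.append("oversized_shift")
    # A clock-out at exactly the assumed system cutoff minute suggests an auto-clockout.
    if local_out.time().replace(second=0, microsecond=0) == auto_clockout_at:
        flags.append("probable_auto_clockout")
    return flags


# Hypothetical shift at a location on a fixed UTC-6 offset, stored in UTC.
CENTRAL = timezone(timedelta(hours=-6))
example = Timecard(
    "A. Employee",
    datetime(2024, 3, 1, 15, 30, tzinfo=timezone.utc),
    datetime(2024, 3, 2, 5, 59, tzinfo=timezone.utc),
)
example_flags = flag_timecard(example, CENTRAL, scheduled_start=time(9, 0))
```

Converting to the location's timezone before comparing keeps the rules correct when timecards are stored in UTC and locations span multiple zones; a real deployment would likely pass a named zone rather than a fixed offset so daylight saving changes are handled.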

Data Engineer

Cognizant Technology Solutions

Aug 2020 – Dec 2020

Austin, TX

  • Built a real-time pipeline for high-volume social media stream analysis using Apache Kafka and Spark Streaming, end-to-end from ingestion to output sink.
  • Conducted large-scale COVID-19 data analysis on AWS EMR producing statistical insights and predictive model inputs from multi-source healthcare datasets.
  • Developed Power BI dashboards for call center analytics, integrating multiple data sources and delivering BI reporting for business stakeholders.

Student Research Assistant

Texas Tech University

Apr 2019 – May 2020

Lubbock, TX

  • Automated data extraction and processing pipelines from large-scale system log files using Python and Bash scripting, improving data accuracy and enabling structured analytical workflows.

Technical Skills

Languages

Python, SQL, TypeScript, Swift, R, SAS, Bash, PowerShell

Big Data

PySpark, Apache Spark, Hadoop, Hive, MapReduce, Kafka, Spark Streaming

Cloud / AWS

S3, EMR, Glue, Redshift, SQS, SNS, SES, Cognito, ECS, Lambda, Bedrock, EventBridge, Cost Explorer

Databases

PostgreSQL, Oracle Autonomous DB, MySQL, MongoDB, Cassandra

APIs / Integration

Square API, Stripe, REST API Design, OpenAI API

AI / LLM

AWS Bedrock, OpenAI API, GitHub Copilot, LLM Integration, AI-Assisted Engineering

ETL / ELT

Apache Airflow, Informatica, Talend, dbt, pandas, Event-Driven Pipelines

Data Quality

Great Expectations, Schema Validation, Anomaly Detection, Data Lineage, Governance Frameworks

ML Frameworks

TensorFlow, PyTorch, Scikit-learn, Statistical Modeling, Predictive Analytics

BI / Viz

Tableau, Power BI, Recharts, Oracle Analytics Cloud

DevOps / Other

Docker, Kubernetes, AWS ECS, Git, Agile/Scrum, Vitest, Playwright

Education

M.S. Data Science

University of Texas at Austin

Aug 2025

B.S. Computer Engineering

Texas Tech University

May 2020

Licenses & Certifications

Databricks

Academy Accreditation — Databricks Lakehouse Fundamentals

Issued Oct 2022

UT Austin

Advances in Deep Learning

Issued Jan 2026

UT Austin

Design Principles & Causal Inference

Issued Dec 2024

UT Austin

Data Structures & Algorithms

Issued Dec 2024

UT Austin

Optimization

Issued May 2025

UT Austin

Deep Learning

Issued Aug 2024

UT Austin

Data Science for Health Discovery and Innovation

Issued May 2024

UT Austin

Foundations of Regression and Predictive Modeling

Issued May 2024

edX

Machine Learning

Issued Dec 2021

edX

Probability and Simulation-Based Inference

Issued Nov 2021

Get In Touch

Open to New Opportunities

Available for senior data engineering, staff engineering, and founding engineer roles.