Syracuse, NY · Open to 2026 Full-Time

Data / ML Engineer
Scalable ETL • MLOps • Deep Learning

I build production-grade data and ML systems: reliable pipelines, observable training, and warehouse-ready outputs—optimized for performance and measurable impact.

Impact
85%+
Agreement vs manual annotations
Forecasting
22%
Improved prediction accuracy
Ops
MLOps
Docker + Kubernetes + Airflow
Warehouse
BQ/PG
BigQuery + Postgres exports
Airflow orchestration · Kubernetes ML jobs · Spark/Hadoop · BigQuery/Postgres · Data contracts · Model retraining · Monitoring-ready
ABOUT_ME

Builder mindset. Production standards.

I like ownership—reliable pipelines, observable training, and warehouse-ready outputs. Clean data contracts and measurable results.

FOCUS
Data-to-AI lifecycle: ingestion → warehouse → orchestration → training → deployment.
STRENGTH
Systems thinking: performance, reliability, reproducibility, and clean interfaces.
STYLE
Fast iteration with high standards: metrics, monitoring, and documented decisions.
WORK_I'M_PROUD_OF

Projects that read like case studies.

Mumbai Housing Valuation Engine (MHVE)

Production-grade DE + ML pipeline for real estate arbitrage and fair-value prediction.

Data · ML
Live · Case Study
Python · SQL · Spark · PostgreSQL · Docker · Docker Compose · Streamlit · Scikit-learn

Agentic Network Security Monitor

Multi-agent threat orchestration with Isolation Forest anomaly detection + stateful memory.

ML · MLOps
Case Study
Python · SQL · Scikit-learn · SQLite · Pandas · NumPy · Matplotlib · JSON Policy Engine

Organoid Brightfield Image Analysis Pipeline

Reproducible training + deployment for phenotype quantification.

ML · MLOps · Data
Case Study
Python · PyTorch · TensorFlow · Airflow · Docker · Kubernetes · BigQuery · Postgres

Geospatial Forecasting at Scale (Spark/Hadoop)

Distributed batch + ML scoring feeding cloud warehouses + BI.

Data · ML
Case Study
Spark · Hadoop · Airflow · Python · BigQuery · Postgres · Tableau · Looker

Streaming Telemetry Patterns (Kafka)

Near real-time ingestion patterns with observable outputs.

Streaming · Data
Case Study
Kafka · Python · Data Contracts · Monitoring

Chronic Kidney Disease Prediction

ML classification with threshold tuning to reduce false positives.

ML
Case Study
Python · Scikit-learn · Pandas · NumPy
WORK_EXPERIENCE

Impact-first bullets.

Problem → solution → measurable outcome, with tooling signal.

Research Assistant — Organoid Image Analysis · Syracuse University

Syracuse, NY · Aug 2024 — Present

  • Built end-to-end data + ML pipelines for brightfield organoid analysis (PyTorch/TensorFlow), reaching 85%+ agreement vs manual annotations.
  • Containerized training/inference with Docker; orchestrated production jobs via Kubernetes; improved GPU throughput including multi-GPU workflows.
  • Created reproducible Airflow workflows for scheduled ETL + retraining; exported features to BigQuery/Postgres for analytics and dashboards.
  • Applied PCA + experimental design to quantify phenotypes; documented methods and enforced data integrity across iterations.
Python · PyTorch · TensorFlow · Airflow · Docker · Kubernetes · BigQuery · Postgres · OpenCV

Software Engineer (Data / ML Pipelines) · Hackveda

Remote · Dec 2023 — Mar 2024

  • Developed geospatial predictive pipelines with Spark/Hadoop; integrated outputs with BigQuery/Postgres for warehousing and analytics.
  • Implemented ML + feature workflows in Dataiku DSS and Python, improving forecast accuracy by 22%; automated batch scoring via Airflow.
  • Built streaming ingestion patterns for near real-time telemetry; surfaced KPIs in Tableau/Looker for stakeholders.
  • Owned data modeling decisions (schemas, partitioning) and introduced CI practices for reliable deployments.
Spark · Hadoop · Airflow · Dataiku DSS · Python · BigQuery · Postgres · Tableau · Looker

Software Engineer (Data Engineering) · Phemesoft (IBM Platinum Business Partner)

Remote · May 2023 — Jul 2023

  • Built tracking + analytics workflows in Python/Pandas, improving order processing and delivery efficiency by 14%.
  • Ran Market Basket Analysis on 50,000+ transactions; drove a 20% lift in repeat purchases and a 15% improvement in loyalty.
  • Optimized Tableau dashboards; reduced reporting turnaround to 1 day and improved decision cycles.
  • Improved data quality via schema validation + cross-team troubleshooting.
Python · Pandas · SQL · ETL · Tableau
TECH_STACK

Depth over buzzwords.

Bigger type, cleaner scan, stronger grouping.

Languages
Python · SQL · JavaScript
Data Engineering
Airflow · Spark · Hadoop · ETL/ELT · Data Modeling · Data Quality · Feature Pipelines
Streaming / Eventing
Kafka patterns · Near real-time telemetry
ML/AI
PyTorch · TensorFlow · Scikit-learn · OpenCV · Evaluation · PCA/Stats
Databases / Warehousing
BigQuery · PostgreSQL · MySQL · MongoDB · Snowflake
Cloud / DevOps
GCP · AWS · Azure · Docker · Kubernetes · CI/CD
Analytics
Tableau · Looker · Dashboards/KPIs
EDUCATION

M.S. in Computer Science · Syracuse University

Graduating May 2026

Relevant coursework
Algorithms · Machine Learning · Operating Systems · Computer Architecture
Achievements
  • Google Professional Data Engineer (Certification).
  • Hackathon 2023 — Ambiora SVKM Mukesh Patel Technology Park.
  • IBM ICE DAY — Technical Poster Competition.
  • Volunteer — Vineyard Church, Syracuse (Logistics & Outreach), Sep 2024 – Present.
LEADERSHIP_COMMUNITY

Leadership & Community

Proof of ownership, coordination, and technical community building.

Organizing Committee — SU Agent-AI Workshop 2026 · Department of Electrical Engineering and Computer Science, Syracuse University

2026

Organized "FROM MODELS TO INTELLIGENT AI AGENTS" workshop featuring academic + industry talks on agentic systems, RL, diffusion models, and environment-driven training.

  • Coordinated speaker logistics and agenda with faculty organizers.
  • Supported execution of sessions and attendee experience end-to-end.
  • Helped facilitate industry + research networking and knowledge-sharing.
GET_IN_TOUCH

Let’s talk.

Best way: email. I reply fast.