Engineering
Industry systems and shipped builds
Production AI work, founding-engineer projects, and weekend builds that won tracks.
Currently working on
Cross-institutional AI software platform building an industrial copilot for manufacturing. Multi-agent workflows run over live plant-floor data.
- Architected multi-agent workflows backed by 2k–5k-node knowledge graphs per deployment, orchestrated through LangChain. They power KPI prediction and time-series analytics over SCADA, PLC, MQTT, and SQL Server data (ingestion sketched after this list).
- Shipped 15+ client POCs end-to-end. Embedded with manufacturing teams on tight timelines to scope, build, and deploy AI workflows on live industrial data.
- Built FastAPI services and a provisioning CLI for Dockerized pipelines on AWS and Kubernetes. Grafana reports are generated automatically via API-triggered workflows (sketched after this list).
- Maintained a shared frontend NPM package to standardize UI components across client apps.
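A minimal sketch of the plant-floor ingestion side, assuming paho-mqtt (≥ 2.0) and JSON payloads; the broker address, topic layout, and `handle_reading` downstream step are illustrative placeholders, not the production schema.

```python
# MQTT ingestion sketch (assumption: paho-mqtt >= 2.0, JSON payloads).
# Broker, topics, and handle_reading() are illustrative placeholders.
import json
import paho.mqtt.client as mqtt

def handle_reading(tag: str, value: float, ts: str) -> None:
    # Hypothetical downstream step: push into the time-series store that
    # the KPI-prediction workflows read from.
    print(f"{ts} {tag}={value}")

def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)
    handle_reading(payload["tag"], payload["value"], payload["timestamp"])

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("plant-broker.local", 1883)  # broker address is a placeholder
client.subscribe("plant/+/sensors/#")       # one topic family per line/cell
client.loop_forever()
```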
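And a rough shape for the API-triggered Grafana step, assuming Grafana's HTTP API (`/api/dashboards/uid/...` and `/api/snapshots`) and the httpx client; the URL, token, and route are placeholders.

```python
# API-triggered Grafana report sketch (assumption: Grafana HTTP API with a
# service-account token; URL, token, and route are placeholders).
import httpx
from fastapi import FastAPI

app = FastAPI()
GRAFANA_URL = "http://grafana.internal:3000"        # placeholder
HEADERS = {"Authorization": "Bearer <service-account-token>"}

@app.post("/reports/{dashboard_uid}")
async def trigger_report(dashboard_uid: str):
    # Fetch the dashboard model, then publish a point-in-time snapshot of it.
    async with httpx.AsyncClient(base_url=GRAFANA_URL, headers=HEADERS) as gf:
        dash = (await gf.get(f"/api/dashboards/uid/{dashboard_uid}")).json()
        snap = await gf.post("/api/snapshots", json={"dashboard": dash["dashboard"]})
        return {"snapshot_url": snap.json().get("url")}
```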
Past work (shipped)
Knowledge Retrieval System for Technical Documents
Capstone · Penn State Learning Factory · Morgan Advanced Materials
Offline, citation-grounded RAG system for technical documents. Team capstone through the Penn State Learning Factory, sponsored by Morgan Advanced Materials, deployed on a local GPU workstation at the sponsor's office. The system addresses two parallel problems: time lost to manual document search, and knowledge loss when experienced staff depart. Featured at the Learning Factory Showcase.
- Enforced citation-based grounding on every AI-generated response. Ungrounded outputs are blocked before they reach the user, which is the policy that makes the system safe to deploy on regulated, internal documents.
- Built a hybrid retrieval pipeline fusing FAISS dense search with BM25 via Reciprocal Rank Fusion, then reranked with a cross-encoder for precision (the fusion step is sketched after this list).
- Ingestion pipeline handles multiple document formats and produces searchable embeddings. Retrieval engine, API, and UI run as separate components on the local GPU workstation.
- Exposed a FastAPI REST backend with SSE streaming for token-by-token responses, wired to a Chainlit interface for chat-style interaction (streaming endpoint sketched after this list).
- Designed a retrieval evaluation framework tracking Recall@K, nDCG@K, MRR, and per-model latency benchmarks across configurations (metric definitions sketched after this list).
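The fusion step in miniature, assuming each retriever already returns document IDs in rank order; k=60 is the conventional RRF constant, not a confirmed project setting.

```python
# Reciprocal Rank Fusion over two ranked lists (dense + BM25).
# Assumption: retrievers return doc IDs in rank order; k=60 is the
# conventional RRF constant, not a confirmed project setting.
from collections import defaultdict

def rrf_fuse(dense_ids: list[str], bm25_ids: list[str], k: int = 60) -> list[str]:
    scores: dict[str, float] = defaultdict(float)
    for ranking in (dense_ids, bm25_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; the cross-encoder reranks this list next.
    return sorted(scores, key=scores.get, reverse=True)

# Documents ranked high in both lists float to the top.
print(rrf_fuse(["d3", "d1", "d7"], ["d1", "d9", "d3"]))  # d1 and d3 lead
```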
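The SSE backend could look roughly like this, using FastAPI's StreamingResponse; `generate_tokens` is a hypothetical stand-in for the grounded-generation pipeline.

```python
# Token-by-token SSE sketch (assumption: generate_tokens() stands in
# for the actual RAG pipeline that Chainlit consumes).
import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def generate_tokens(question: str):
    for token in ["Grounded ", "answer ", "with ", "citations."]:
        await asyncio.sleep(0.05)   # simulate model latency
        yield f"data: {token}\n\n"  # SSE frame: "data: ..." + blank line
    yield "data: [DONE]\n\n"

@app.get("/ask")
async def ask(q: str):
    return StreamingResponse(generate_tokens(q), media_type="text/event-stream")
```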
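The ranking metrics reduce to a few lines each under their standard binary-relevance definitions; a sketch, not the project's actual harness.

```python
# Standard ranking-metric definitions (binary relevance).
import math

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

def mrr(retrieved: list[str], relevant: set[str]) -> float:
    # Reciprocal rank of the first relevant hit.
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    dcg = sum(1.0 / math.log2(r + 1)
              for r, d in enumerate(retrieved[:k], start=1) if d in relevant)
    ideal = sum(1.0 / math.log2(r + 1)
                for r in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0
```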
Fully autonomous fact-checking system that goes beyond true-or-false. Claims are decomposed into atomic sub-claims, evidence is weighted by source credibility, and an adversarial debate triggers when sources disagree. Verdicts are written to an immutable Solana ledger and read aloud as a voice summary.
- Decomposes claims into atomic sub-claims using the HiSS method, then retrieves evidence via Gemini with the google_search tool.
- Scores ~4,000 sources for credibility using the MBFC dataset, so contradictory but low-credibility evidence gets down-weighted instead of just counted (weighting sketched after this list).
- Triggers adversarial pro/con debate whenever inter-agent agreement falls below 80%. Outputs a 7-label verdict spanning TRUE through CONFLICTING and UNVERIFIABLE, rather than a flat boolean (trigger logic sketched after this list).
- Writes every verdict as a Solana memo on devnet so the result is permanent and tamper-evident. Generates a voice summary via ElevenLabs TTS.
- Streams pipeline progress in real time over SSE. Backed by Turso (libSQL) for serverless deployment, with SQLite for local development.
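Credibility weighting can be as simple as scaling each piece of evidence by its source score; a sketch assuming MBFC-derived scores normalized to [0, 1], with names and values illustrative.

```python
# Credibility-weighted evidence aggregation (assumption: MBFC-derived
# source scores normalized to [0, 1]; stance is +1 support / -1 refute).
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str
    stance: int         # +1 supports the sub-claim, -1 refutes it
    credibility: float  # 0.0 (unreliable) .. 1.0 (high credibility)

def weighted_verdict_score(evidence: list[Evidence]) -> float:
    # Low-credibility contradictions shrink instead of counting equally.
    total = sum(e.stance * e.credibility for e in evidence)
    weight = sum(e.credibility for e in evidence)
    return total / weight if weight else 0.0

ev = [Evidence("reuters.com", +1, 0.95), Evidence("randomblog.net", -1, 0.2)]
print(weighted_verdict_score(ev))  # ~0.65: support dominates despite the dissent
```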
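The debate trigger reduces to an agreement check over per-agent verdicts; a sketch using the 80% threshold from above, where only TRUE, FALSE, CONFLICTING, and UNVERIFIABLE are confirmed label names and the intermediate ones are assumptions.

```python
# Agreement check that gates the adversarial debate stage.
# Assumption: intermediate label names are illustrative; the source only
# confirms a 7-label scale spanning TRUE .. CONFLICTING and UNVERIFIABLE.
from collections import Counter
from enum import Enum

class Verdict(Enum):
    TRUE = "TRUE"
    MOSTLY_TRUE = "MOSTLY_TRUE"    # assumed label
    HALF_TRUE = "HALF_TRUE"        # assumed label
    MOSTLY_FALSE = "MOSTLY_FALSE"  # assumed label
    FALSE = "FALSE"
    CONFLICTING = "CONFLICTING"
    UNVERIFIABLE = "UNVERIFIABLE"

def needs_debate(agent_verdicts: list[Verdict], threshold: float = 0.8) -> bool:
    # Agreement = share of agents voting for the modal verdict.
    top_count = Counter(agent_verdicts).most_common(1)[0][1]
    return top_count / len(agent_verdicts) < threshold

print(needs_debate([Verdict.TRUE, Verdict.TRUE, Verdict.FALSE]))  # True: 67% < 80%
```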
AI-personalized sustainability app that delivers one tailored micro-action per day. The app stays transparent about the carbon cost of every AI inference it runs. Built in 12 hours with a small team.
- Onboards users in 90 seconds. Daily actions are tailored to commute distance, diet pattern, the city's live grid carbon intensity, and current weather, drawing on EPA and DEFRA emissions data.
- Curated and scored a knowledge base of 190 actions. Structured around behavioral science frameworks like Fogg's B=MAP and Tiny Habits, with points, streaks, and SDG tracking layered on top.
- Built a Chrome extension and Eco-LLM dashboard that monitor energy (Wh), carbon (gCO2), and water (mL) per Gemini prompt. Semantic caching serves similar queries at zero additional inference cost (cache sketched below).
- Projected impact at a hypothetical 100,000 daily users is around 12,000 tonnes of CO2 removed per year, with a typical carbon ROI of 10,000:1 or higher (arithmetic worked below).
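Semantic caching here means answering near-duplicate prompts from a stored response instead of re-running the model; a minimal sketch using cosine similarity over embeddings, where `embed` is any sentence-embedding call and the 0.9 threshold is illustrative.

```python
# Semantic cache sketch: near-duplicate prompts reuse a stored answer,
# so they cost zero extra inference (and zero extra Wh/gCO2/mL).
# Assumptions: embed() is any sentence-embedding call; 0.9 is illustrative.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

class SemanticCache:
    def __init__(self, embed, threshold: float = 0.9):
        self.embed, self.threshold = embed, threshold
        self.entries: list[tuple[np.ndarray, str]] = []

    def get(self, prompt: str) -> str | None:
        v = self.embed(prompt)
        best = max(self.entries, key=lambda e: cosine(v, e[0]), default=None)
        if best and cosine(v, best[0]) >= self.threshold:
            return best[1]  # cache hit: skip the model call entirely
        return None

    def put(self, prompt: str, answer: str) -> None:
        self.entries.append((self.embed(prompt), answer))
```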
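The headline projection is consistent with roughly a third of a kilogram avoided per user per day; a quick check under that assumption (the per-user figure is inferred, not stated).

```python
# Back-of-envelope check of the 12,000 t/yr projection.
# Assumption: ~0.33 kg CO2 avoided per user per day (inferred, not stated).
users = 100_000
kg_per_user_per_day = 0.33
tonnes_per_year = users * kg_per_user_per_day * 365 / 1000
print(f"{tonnes_per_year:,.0f} t CO2/yr")  # ~12,045, matching the ~12,000 figure
```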