Available for consulting & freelance projects

Mukul Kumar

Applied AI Engineer

Specializing in Computer Vision, NLP, and Production ML Systems. Senior Data Scientist with 6+ years building enterprise-grade AI that ships, scales, and delivers measurable impact. MTech AI at BITS Pilani.

Years Building ML Systems

Filed Patents

Published Research Papers

40+

Cameras Deployed

Selected Work

Featured Projects

Deep case studies on production AI systems — from patent-filed document intelligence to real-time edge deployment at scale.

Flagship · 4 Patents Filed

Receipt Intelligence System

Document AI · Computer Vision · NLP

End-to-end document intelligence pipeline for automated extraction of structured data from millions of diverse receipt formats across 10+ countries.

Problem

Enterprises needed accurate, scalable extraction from highly variable receipt layouts — different languages, currencies, and merchant formats.

Approach

Combined OCR preprocessing, Graph Attention Networks for layout understanding, and NER pipelines for entity extraction. Optimized with TensorRT for production inference.
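To make the layout-understanding step concrete, here is a minimal sketch of one single-head graph attention layer in plain Python. It is a generic illustration of the standard GAT attention mechanism, not the production or patented model. The node features (normalized text-box centers), the neighbor lists, the weights `W` and `a`, and the helper names are all hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU as used when scoring edges in a GAT layer."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """One single-head graph attention layer.

    features:  list of node feature vectors (here: one per text box)
    neighbors: dict mapping node index -> list of neighbor indices
               (include the node itself to keep a self-loop)
    W:         out_dim x in_dim weight matrix (hypothetical values)
    a:         attention vector of length 2 * out_dim (hypothetical values)
    """
    z = [matvec(W, h) for h in features]  # linear projection of every node
    out = []
    for i, zi in enumerate(z):
        nbrs = neighbors[i]
        # Raw score per edge: LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, zi + z[j])))
            for j in nbrs
        ]
        # Softmax over the node's neighbors (shifted by the max for stability)
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # New embedding: attention-weighted sum of projected neighbor features
        out.append([
            sum(al * z[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(zi))
        ])
    return out


# Hypothetical example: three text boxes described by normalized (x, y) centers.
# Boxes 0 and 1 sit on the same printed line, box 2 is further down the receipt.
boxes = [[0.10, 0.20], [0.55, 0.20], [0.10, 0.80]]
edges = {0: [0, 1], 1: [1, 0], 2: [2]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [1.0, -1.0, 0.5, 0.5]
print(gat_layer(boxes, edges, W, a))
```

In a trained model, `W` and `a` are learned, and a framework implementation would normally replace this hand-written loop.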

PyTorch · TensorRT · Graph Attention Networks · OCR · NER · Triton · Docker

Graph Attention Networks for receipt layout detection
Multi-language OCR with custom preprocessing pipeline
NER pipeline for key-field extraction
TensorRT optimization for production inference
Deployed and serving across 10+ countries

94%+

Extraction Accuracy

60%

Latency Reduction

Patents Filed

Edge AI · Multi-Camera Scale

Real-Time Social Distancing Monitor

Computer Vision · Edge AI · Deployment

Large-scale CCTV monitoring system for automated compliance detection, deployed on 40+ cameras with real-time edge inference and sub-100ms latency.

Problem

Organizations needed reliable, real-time monitoring across distributed camera networks without cloud latency or data-privacy concerns.

Approach

NVIDIA DeepStream pipeline on Jetson edge devices with homography-based distance estimation, spatial zone mapping, and a distributed alert system.
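The idea behind homography-based distance estimation can be sketched in a few lines of plain Python: project each detected person's foot point from image pixels onto the ground plane, then measure pairwise distances in metres. This is an illustrative sketch under stated assumptions, not the deployed pipeline. The matrix `H`, the foot points, the 2.0 m threshold, and the function names are hypothetical; in practice `H` would come from a per-camera calibration step such as OpenCV's `cv2.findHomography`.

```python
from itertools import combinations
from math import hypot


def to_ground_plane(H, point):
    """Map an image pixel (x, y) to ground-plane coordinates with a 3x3 homography."""
    x, y = point
    gx = H[0][0] * x + H[0][1] * y + H[0][2]
    gy = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get metric ground coordinates
    return gx / w, gy / w


def close_pairs(H, foot_points, min_distance_m):
    """Return (i, j, distance) for every pair closer than min_distance_m."""
    ground = [to_ground_plane(H, p) for p in foot_points]
    violations = []
    for i, j in combinations(range(len(ground)), 2):
        d = hypot(ground[i][0] - ground[j][0], ground[i][1] - ground[j][1])
        if d < min_distance_m:
            violations.append((i, j, d))
    return violations


# Hypothetical calibration: a pure scale of 100 pixels per metre, no perspective.
H = [
    [0.01, 0.0, 0.0],
    [0.0, 0.01, 0.0],
    [0.0, 0.0, 1.0],
]
# Hypothetical foot points: bottom-center of each person's bounding box, in pixels.
feet = [(100, 100), (200, 100), (600, 500)]
print(close_pairs(H, feet, min_distance_m=2.0))
```

Using the bottom-center of each bounding box matters because the homography is only valid for points that actually lie on the ground plane.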

DeepStream · TensorRT · YOLO · NVIDIA Jetson · OpenCV · CUDA · Python

Multi-camera real-time processing pipeline
Edge deployment on NVIDIA Jetson devices
DeepStream for high-throughput inference
Custom homography-based distance estimation
Distributed alert and spatial zone management

40+

Cameras Deployed

<100ms

Inference Latency

12×

GPU vs CPU Speedup

Research · MDPI Biomimetics 2023

Parkinson's Disease Classifier

Healthcare AI · Transformers · Research

Transformer-based classification model for early Parkinson's detection using vocal biomarkers. Achieved 97.2% accuracy, published in MDPI Biomimetics, July 2023.

Problem

Early Parkinson's detection requires expensive clinical assessments. Objective, non-invasive screening using voice biomarkers could democratize access.

Approach

Fine-tuned Transformer architectures on voice recordings, with novel feature engineering from acoustic biomarkers. Evaluated on open benchmarks against traditional ML baselines.

Transformers · PyTorch · Signal Processing · scikit-learn · librosa · Python

Transformer architecture for acoustic biomarker time-series
Novel feature engineering from complex vocal biomarkers
Outperforms traditional ML baselines across all metrics
Published in MDPI Biomimetics, July 2023
Evaluated on open, public benchmark datasets

97.2%

Classification Accuracy

0.96

F1 Score

MDPI '23

Publication Venue

Consulting Workflow

How I Work

A structured approach that turns vague AI ideas into reliable production systems. Every engagement follows this proven process.

Problem Discovery

Dive deep into your business problem: understand data availability, constraints, success metrics, and what 'done' looks like.

Data Understanding

Audit your data pipeline, quality, and volume. Identify gaps, biases, and opportunities. Define realistic model targets.

Rapid Prototyping

Build a working proof-of-concept fast. Validate the core assumption before engineering a full system. Fail fast, learn faster.

Model Development

Architect and train the right model for your problem. Rigorous experimentation, evaluation, and ablation to ensure quality.

Optimization & Deployment

Apply TensorRT, quantization, and pruning to cut inference latency. Package into Docker/Triton. Deploy to your infrastructure.

Monitoring & Iteration

Set up drift detection, performance dashboards, and alerting. Continuously improve based on production feedback.
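As one illustration of what drift detection can look like, here is a minimal sketch of the Population Stability Index (PSI) for a single numeric feature or model score. It is one common option rather than a description of any specific engagement. The bin count, the `eps` floor, the 0.2 alert threshold (a widely used rule of thumb), and the function name are assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample.

    Both inputs are non-empty lists of numbers. Bins are equal-width over the
    range of the reference sample; live values outside that range are clamped
    into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Hypothetical data: scores seen at training time vs. scores seen in production.
reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

if psi(reference, shifted) > 0.2:  # assumed alert threshold
    print("drift alert: input distribution has shifted")
```

A check like this would typically run on a schedule per feature, with the result feeding the dashboards and alerts described above.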

What I Offer

Services

From initial feasibility to production deployment. If your problem involves AI, I can help you build it right.

AI/ML Consulting

Feasibility studies, architecture reviews, and strategic roadmaps to help you make the right AI investments.

Computer Vision Systems

End-to-end CV solutions: detection, segmentation, classification, tracking, and multi-camera pipelines.

OCR & Document Intelligence

Production OCR pipelines, layout analysis, NER, and structured data extraction from any document format.

Custom NLP Pipelines

Named entity recognition, knowledge graphs, coreference resolution, and Transformer fine-tuning for your domain.

MLOps & Inference Optimization

TensorRT, Triton Inference Server, model quantization, and GPU optimization to slash latency and hosting costs.

Generative AI Prototyping

Rapid prototyping with LLMs, diffusion models, and RAG pipelines — from proof-of-concept to production-ready.

Edge AI Deployment

Deploy models on NVIDIA Jetson, mobile devices, and embedded systems. Real-time inference without cloud dependency.

AI Feasibility Studies

Honest assessment of whether AI can solve your problem, at what cost, and with what expected ROI — before you commit.

Not sure if your problem fits? Let's talk.

Get in touch

Capabilities

Technical Depth

Organized by capability — not thrown on a page as a sticker collection.

AI & Machine Learning

PyTorch · TensorFlow · Transformers (HuggingFace) · Diffusion Models · Graph Neural Networks · Reinforcement Learning · scikit-learn · XGBoost

Computer Vision

OpenCV · NVIDIA DeepStream · TensorRT · YOLO (v5–v10) · Detectron2 · SAM · ControlNet · Stable Diffusion

NLP & Document AI

Named Entity Recognition · Knowledge Graphs · Coreference Resolution · OCR Pipelines · LangChain / RAG · spaCy · NLTK · Document Layout Analysis

Infrastructure & MLOps

Docker · NVIDIA Triton · TF Serving · AWS (EC2, S3, SageMaker) · GCP (Vertex AI) · Kubernetes · CUDA / cuDNN · MLflow

Languages

Python · TypeScript · C++ · CUDA · SQL · Bash

Research & IP

Publications & Patents

2 published research papers and 5 filed patents — original IP from building production AI systems, not theoretical exercises.

Research Paper · KDD 2022 Workshop · Washington DC

Graph Attention Networks for Efficient Text Line Detection on Receipt-Layout Documents

Proposed a GAT-based approach to model spatial relationships between text elements in receipts, enabling robust text line detection and layout understanding across diverse receipt formats. Presented at the Document Intelligence Workshop, KDD 2022, Washington DC.

Graph Neural Networks · Document AI · OCR · Layout Analysis

Research Paper · MDPI Biomimetics · July 2023

A Novel Artificial-Intelligence-Based Approach for Classification of Parkinson's Disease Using Complex and Large Vocal Features

Developed a Transformer-based model for non-invasive early detection of Parkinson's disease using acoustic and vocal biomarkers extracted from speech recordings. Outperforms traditional ML baselines across all evaluation metrics.

Transformers · Healthcare AI · Signal Processing · PyTorch

Patent · US81259087 · Jun 2022

Methods, System, Apparatus and Articles of Manufacture to Detect Lines on Documents

Patent covering detection and classification of text lines in receipt-layout documents using graph-based spatial reasoning and deep learning inference pipelines.

Document AI · Graph Neural Networks · OCR · Layout Detection

Patent · US81254014 · Jun 2021

Automatic Document Content Extraction and Decoding

Patent covering the end-to-end pipeline for automated extraction and interpretation of structured information from unstructured document formats using AI and NLP.

Document AI · NLP · OCR · Information Extraction

Patent · Filed · Jun 2021

Automated Extraction of Purchased Items from Receipt Images

Patent on extracting individual line items — product description, quantity, and price — from printed receipt images using computer vision and deep learning techniques.

Computer Vision · Receipt AI · Deep Learning · NLP

Patent · Filed · Jun 2021

Automated Receipt Decoding: Product Matching and Dictionaries

Patent on automated receipt decoding in DDS, covering product matching algorithms and dictionary-based normalization for structured extraction of receipt content.

NLP · Product Matching · Information Extraction · Document AI

Patent · 202021034381 · Aug 2020

Method and System to Detect Distance Between Entities

Patent covering the multi-camera system for real-time measurement of inter-person distances using homography projection and deep learning-based person detection for social distancing compliance.

Computer Vision · Edge AI · DeepStream · Safety Systems

Writing

Technical Insights

Practical lessons from building production AI systems. No theory — only what actually works (and what doesn't) in the real world.

OCR · TensorRT · Optimization

Optimizing OCR Pipelines Using TensorRT

How we reduced inference latency by 60% on a production OCR system processing millions of receipts — the exact techniques, trade-offs, and lessons learned.

Coming Soon

8 min read

Computer Vision · Edge AI · DeepStream

Lessons from Deploying CV Models on 40+ Cameras

What nobody tells you about edge AI at scale. Camera calibration, model drift, hardware failures, and why the deployment phase takes longer than training.

Coming Soon

10 min read

Document AI · Production ML · NLP

Why Document AI Fails in Production

The gap between benchmark accuracy and real-world performance in document intelligence. Layout variance, multi-language edge cases, and how to design for them.

Coming Soon

7 min read

GNNs · Document AI · PyTorch

Graph Neural Networks for Document Understanding

A practical walkthrough of using GATs to model spatial relationships between text regions — with code, architecture diagrams, and performance results.

Coming Soon

12 min read

Articles launching soon. Follow on LinkedIn for updates.

Get In Touch

Let's Work Together

Available for consulting engagements, freelance projects, and full-time opportunities. If you have an AI problem worth solving, let's talk.

Whether you're building a new AI product, optimizing an existing system, or need an expert second opinion — I'm here to help.

Typical response time: within 24 hours

mukul.kr99@gmail.com

linkedin.com/in/mukulkr

GitHub

github.com/cs-savvy