Available for consulting & freelance projects

Mukul Kumar

Applied AI Engineer

Specializing in Computer Vision, NLP, and Production ML Systems. Senior Data Scientist with 6+ years building enterprise-grade AI that ships, scales, and delivers measurable impact. MTech in AI, BITS Pilani.

Years Building ML Systems: 6+
Filed Patents: 5
Published Research Papers: 2
Cameras Deployed: 40+

Featured Projects

Deep case studies on production AI systems — from patent-filed document intelligence to real-time edge deployment at scale.

01
Flagship · 4 Patents Filed

Receipt Intelligence System

Document AI · Computer Vision · NLP

End-to-end document intelligence pipeline for automated extraction of structured data from millions of diverse receipt formats across 10+ countries.

Problem

Enterprises needed accurate, scalable extraction from highly variable receipt layouts — different languages, currencies, and merchant formats.

Approach

Combined OCR preprocessing, Graph Attention Networks for layout understanding, and NER pipelines for entity extraction. Optimized with TensorRT for production inference.

PyTorch · TensorRT · Graph Attention Networks · OCR · NER · Triton · Docker
  • Graph Attention Networks for receipt layout detection
  • Multi-language OCR with custom preprocessing pipeline
  • NER pipeline for key-field extraction
  • TensorRT optimization for production inference
  • Deployed and serving across 10+ countries
Extraction Accuracy: 94%+
Latency Reduction: 60%
Patents Filed: 4
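
To illustrate the layout-understanding step named in the approach above, here is a minimal pure-Python sketch of single-head graph attention in its standard textbook form: each text box scores its neighbors with LeakyReLU(a · [Wh_i ‖ Wh_j]), normalizes the scores with a softmax, and aggregates the projected neighbor features. The toy box features, the neighbor graph, the weights `W` and `a`, and the helper names are all hypothetical; this is not the production or patented implementation.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def project(W, h):
    # Linear projection W @ h for one node's feature vector.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def attention_coefficients(features, neighbors, W, a):
    """Return (alpha, projected), where alpha[i][j] is node i's attention on j."""
    projected = [project(W, h) for h in features]
    alpha = {}
    for i, nbrs in neighbors.items():
        scores = []
        for j in nbrs:
            concat = projected[i] + projected[j]  # [Wh_i || Wh_j]
            scores.append(leaky_relu(sum(w * x for w, x in zip(a, concat))))
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return alpha, projected

def aggregate(alpha, projected):
    # New node embedding: attention-weighted sum of projected neighbor features.
    out = {}
    for i, weights in alpha.items():
        dim = len(projected[i])
        out[i] = [sum(w * projected[j][d] for j, w in weights.items())
                  for d in range(dim)]
    return out

# Hypothetical toy input: three OCR boxes described by normalized (x, y) centers.
features = [[0.10, 0.20], [0.55, 0.21], [0.12, 0.60]]
# Hypothetical graph: each box attends to itself and to nearby boxes.
neighbors = {0: [0, 1, 2], 1: [0, 1], 2: [0, 2]}
W = [[1.0, 0.0], [0.0, 1.0]]   # assumed projection (identity for readability)
a = [0.5, -1.0, 0.5, -1.0]     # assumed attention vector

alpha, projected = attention_coefficients(features, neighbors, W, a)
node_embeddings = aggregate(alpha, projected)
```

In a trained model, `W` and `a` would be learned and several heads would typically be combined; the sketch only shows how the attention weights over neighboring text boxes are formed.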
02
Edge AI · Multi-Camera Scale

Real-Time Social Distancing Monitor

Computer Vision · Edge AI · Deployment

Large-scale CCTV monitoring system for automated compliance detection, deployed on 40+ cameras with real-time edge inference and sub-100ms latency.

Problem

Organizations needed reliable, real-time monitoring across distributed camera networks without cloud latency or data-privacy concerns.

Approach

NVIDIA DeepStream pipeline on Jetson edge devices with homography-based distance estimation, spatial zone mapping, and a distributed alert system.

DeepStream · TensorRT · YOLO · NVIDIA Jetson · OpenCV · CUDA · Python
  • Multi-camera real-time processing pipeline
  • Edge deployment on NVIDIA Jetson devices
  • DeepStream for high-throughput inference
  • Custom homography-based distance estimation
  • Distributed alert and spatial zone management
Cameras Deployed: 40+
Inference Latency: <100ms
GPU vs CPU Speedup: 12×
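
The homography-based distance estimation mentioned in the approach can be sketched roughly as follows: project each detected person's foot point from image pixels onto the ground plane with a 3×3 homography, then compare pairwise ground distances against a threshold. The matrix `H`, the foot points, the 2-metre threshold, and the helper names are hypothetical values chosen for illustration; in practice the homography would come from per-camera calibration (for example with OpenCV's `cv2.findHomography`), and this is not the deployed code.

```python
import math
from itertools import combinations

def to_ground_plane(H, point):
    """Map an image point (pixels) to ground-plane coordinates (metres)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)  # divide by w to leave homogeneous coordinates

def close_pairs(H, foot_points, min_distance_m):
    """Return (i, j, distance) for every pair closer than min_distance_m."""
    ground = [to_ground_plane(H, p) for p in foot_points]
    violations = []
    for (i, p), (j, q) in combinations(enumerate(ground), 2):
        d = math.dist(p, q)
        if d < min_distance_m:
            violations.append((i, j, d))
    return violations

# Hypothetical calibration: a pure scaling of 0.01 m per pixel.
H = [[0.01, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 1.0]]
# Hypothetical bottom-centre points of three person bounding boxes.
foot_points = [(100, 400), (220, 400), (600, 420)]

violations = close_pairs(H, foot_points, min_distance_m=2.0)
```

Using the bottom centre of each bounding box is a common choice because that point lies on the ground plane, which is the only plane the homography is valid for.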
03
Research · MDPI Biomimetics 2023

Parkinson's Disease Classifier

Healthcare AI · Transformers · Research

Transformer-based classification model for early Parkinson's detection using vocal biomarkers. Achieved 97.2% accuracy, published in MDPI Biomimetics, July 2023.

Problem

Early Parkinson's detection requires expensive clinical assessments. Objective, non-invasive screening using voice biomarkers could democratize access.

Approach

Fine-tuned Transformer architectures on voice recordings, with novel feature engineering from acoustic biomarkers. Evaluated on open benchmarks against traditional ML baselines.

Transformers · PyTorch · Signal Processing · scikit-learn · librosa · Python
  • Transformer architecture for acoustic biomarker time-series
  • Novel feature engineering from complex vocal biomarkers
  • Outperforms traditional ML baselines across all metrics
  • Published in MDPI Biomimetics, July 2023
  • Evaluated on open, public benchmark datasets
Classification Accuracy: 97.2%
F1 Score: 0.96
Publication Venue: MDPI '23

How I Work

A structured approach that turns vague AI ideas into reliable production systems. Every engagement follows this proven process.

01

Problem Discovery

A deep dive into your business problem: data availability, constraints, success metrics, and what 'done' looks like.

02

Data Understanding

Audit your data pipeline, quality, and volume. Identify gaps, biases, and opportunities. Define realistic model targets.

03

Rapid Prototyping

Build a working proof-of-concept fast. Validate the core assumption before engineering a full system. Fail fast, learn faster.

04

Model Development

Architect and train the right model for your problem. Rigorous experimentation, evaluation, and ablation to ensure quality.

05

Optimization & Deployment

TensorRT, quantization, pruning — squeeze every millisecond. Package into Docker/Triton. Deploy to your infrastructure.
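
As a small illustration of the quantization idea in this step, the sketch below applies symmetric per-tensor INT8 quantization to a handful of weights and measures the round-trip error. The weight values and function names are hypothetical; production toolchains such as TensorRT choose scales through calibration and operate on whole networks, so treat this only as the underlying arithmetic.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate float values from the stored integers.
    return [v * scale for v in quantized]

# Hypothetical weights for illustration only.
weights = [0.42, -1.27, 0.03, 0.88, -0.5]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The round-trip error is bounded by half the scale, which is why tensors with a few large outliers tend to quantize poorly under a single per-tensor scale.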

06

Monitoring & Iteration

Set up drift detection, performance dashboards, and alerting. Continuously improve based on production feedback.
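
One simple way to approach the drift detection mentioned here is the Population Stability Index (PSI), which compares the binned distribution of a feature or model score in production against a reference window. The sketch below is a minimal version under stated assumptions: equal-width bins taken from the reference data, a small `eps` floor for empty bins, and synthetic data. The function and variable names are hypothetical, and the commonly quoted 0.2 alert level is a rule of thumb, not a fixed standard.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` relative to `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate reference window: everything lands in bin 0

    def histogram(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clamp out-of-range values
            counts[idx] += 1
        n = len(values)
        # Floor each share at eps so the logarithm below stays defined.
        return [max(c / n, eps) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Synthetic data: a reference window and a shifted production window.
reference = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in reference]

no_drift_score = psi(reference, reference)
drift_score = psi(reference, shifted)
```

A monitoring job could compute this per feature on a schedule and raise an alert when the score crosses the chosen threshold.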

Services

From initial feasibility to production deployment. If your problem involves AI, I can help you build it right.

AI/ML Consulting

Feasibility studies, architecture reviews, and strategic roadmaps to help you make the right AI investments.

Computer Vision Systems

End-to-end CV solutions: detection, segmentation, classification, tracking, and multi-camera pipelines.

OCR & Document Intelligence

Production OCR pipelines, layout analysis, NER, and structured data extraction from any document format.

Custom NLP Pipelines

Named entity recognition, knowledge graphs, coreference resolution, and Transformer fine-tuning for your domain.

MLOps & Inference Optimization

TensorRT, Triton Inference Server, model quantization, and GPU optimization to slash latency and hosting costs.

Generative AI Prototyping

Rapid prototyping with LLMs, diffusion models, and RAG pipelines — from proof-of-concept to production-ready.

Edge AI Deployment

Deploy models on NVIDIA Jetson, mobile devices, and embedded systems. Real-time inference without cloud dependency.

AI Feasibility Studies

Honest assessment of whether AI can solve your problem, at what cost, and with what expected ROI — before you commit.

Not sure if your problem fits? Let's talk.

Get in touch

Technical Depth

Organized by capability — not thrown on a page as a sticker collection.

AI & Machine Learning

PyTorch · TensorFlow · Transformers (HuggingFace) · Diffusion Models · Graph Neural Networks · Reinforcement Learning · scikit-learn · XGBoost

Computer Vision

OpenCV · NVIDIA DeepStream · TensorRT · YOLO (v5–v10) · Detectron2 · SAM · ControlNet · Stable Diffusion

NLP & Document AI

Named Entity Recognition · Knowledge Graphs · Coreference Resolution · OCR Pipelines · LangChain / RAG · spaCy · NLTK · Document Layout Analysis

Infrastructure & MLOps

Docker · NVIDIA Triton · TF Serving · AWS (EC2, S3, SageMaker) · GCP (Vertex AI) · Kubernetes · CUDA / cuDNN · MLflow

Languages

Python · TypeScript · C++ · CUDA · SQL · Bash

Publications & Patents

2 published research papers and 5 filed patents — original IP from building production AI systems, not theoretical exercises.

Research Paper · KDD 2022 Workshop · Washington DC

Graph Attention Networks for Efficient Text Line Detection on Receipt-Layout Documents

Proposed a GAT-based approach to model spatial relationships between text elements in receipts, enabling robust text line detection and layout understanding across diverse receipt formats. Presented at the Document Intelligence Workshop, KDD 2022, Washington DC.

Graph Neural Networks · Document AI · OCR · Layout Analysis

Research Paper · MDPI Biomimetics · July 2023

A Novel Artificial-Intelligence-Based Approach for Classification of Parkinson's Disease Using Complex and Large Vocal Features

Developed a Transformer-based model for non-invasive early detection of Parkinson's disease using acoustic and vocal biomarkers extracted from speech recordings. Outperforms traditional ML baselines across all evaluation metrics.

Transformers · Healthcare AI · Signal Processing · PyTorch

Patent · US81259087 · Jun 2022

Methods, System, Apparatus and Articles of Manufacture to Detect Lines on Documents

Patent covering detection and classification of text lines in receipt-layout documents using graph-based spatial reasoning and deep learning inference pipelines.

Document AI · Graph Neural Networks · OCR · Layout Detection

Patent · US81254014 · Jun 2021

Automatic Document Content Extraction and Decoding

Patent covering the end-to-end pipeline for automated extraction and interpretation of structured information from unstructured document formats using AI and NLP.

Document AI · NLP · OCR · Information Extraction

Patent · Filed · Jun 2021

Automated Extraction of Purchased Items from Receipt Images

Patent on extracting individual line items — product description, quantity, and price — from printed receipt images using computer vision and deep learning techniques.

Computer Vision · Receipt AI · Deep Learning · NLP

Patent · Filed · Jun 2021

Automated Receipt Decoding: Product Matching and Dictionaries

Patent on automated receipt decoding in DDS, covering product matching algorithms and dictionary-based normalization for structured extraction of receipt content.

NLP · Product Matching · Information Extraction · Document AI

Patent · 202021034381 · Aug 2020

Method and System to Detect Distance Between Entities

Patent covering the multi-camera system for real-time measurement of inter-person distances using homography projection and deep learning-based person detection for social distancing compliance.

Computer Vision · Edge AI · DeepStream · Safety Systems

Technical Insights

Practical lessons from building production AI systems. No theory — only what actually works (and what doesn't) in the real world.

OCR · TensorRT · Optimization

Optimizing OCR Pipelines Using TensorRT

How we reduced inference latency by 60% on a production OCR system processing millions of receipts — the exact techniques, trade-offs, and lessons learned.

Coming Soon
8 min read

Computer Vision · Edge AI · DeepStream

Lessons from Deploying CV Models on 40+ Cameras

What nobody tells you about edge AI at scale. Camera calibration, model drift, hardware failures, and why the deployment phase takes longer than training.

Coming Soon
10 min read

Document AI · Production ML · NLP

Why Document AI Fails in Production

The gap between benchmark accuracy and real-world performance in document intelligence. Layout variance, multi-language edge cases, and how to design for them.

Coming Soon
7 min read

GNNs · Document AI · PyTorch

Graph Neural Networks for Document Understanding

A practical walkthrough of using GATs to model spatial relationships between text regions — with code, architecture diagrams, and performance results.

Coming Soon
12 min read

Articles launching soon. Follow on LinkedIn for updates.

Let's Work Together

Available for consulting engagements, freelance projects, and full-time opportunities. If you have an AI problem worth solving, let's talk.

Whether you're building a new AI product, optimizing an existing system, or need an expert second opinion — I'm here to help.

Typical response time: within 24 hours

Or email directly at mukul.kr99@gmail.com