Mumuksh Tayal

01 — About

Background

I'm a research student at the Indian Institute of Science (IISc), Bangalore, working at the Robert Bosch Centre for Cyber-Physical Systems (RBCCPS). My research lies at the intersection of Reinforcement Learning, Optimization, and Safety, with a focus on developing frameworks for safe and performant Offline RL that can handle hard, state-wise safety constraints.

Prior to IISc, I completed my engineering degree at IIT Gandhinagar, where I developed a strong interest in computers and systems. I've also worked as an R&D Intern at SanDisk (Western Digital), where I contributed to optimizing LLM inference on edge devices.

02 — Publications

Research Papers

Safe Flow Q-Learning: Offline Safe Reinforcement Learning with Reachability-Based Flow Policies arXiv

M. Tayal, M. Tayal, R. Prakash. Proposed a safe offline RL framework that learns a single-shot optimal policy co-optimizing safety and performance. The framework removes the need for rejection sampling at inference, a prevalent approach in existing safe offline RL frameworks.

Under Review Safe Offline RL Flow Matching

EpiFlow: Epigraph-Guided Flow Matching for Safe and Performant Offline RL ICLR VerifAI Workshop 2026

M. Tayal, M. Tayal. Devised a value-function-guided Offline RL framework, based on an epigraph reformulation, that co-optimizes the agent's policy for safety and performance, maximizing reward while satisfying hard, state-wise safety constraints at every step.

ICLR Workshop Oral Offline RL Flow Matching Safety

V-OCBF: Value-Guided Offline Control Barrier Functions TMLR 2026

M. Tayal, M. Tayal, S. Kolathaya, R. Prakash. A value-guided framework that learns actuation-aware, infinite-horizon safe Neural Control Barrier Functions from offline demonstrations — producing deployable safety filters that satisfy hard, state-wise constraints.

TMLR Accepted AAAI Workshop Oral Control Barrier Functions Safety Filters

RISE: Robust Imitation through Stochastic Encodings IROS 2025

M. Tayal, M. Tayal, R. Prakash. A variational, parameter-conditioned imitation learning framework that encodes noisy/stochastic obstacle and goal sensor readings into a probabilistic latent space to improve robustness of offline-learned models.

LeaPRiDE Workshop Imitation Learning Variational Inference

Multi-Instance DNN Inferencing on Heterogeneous Edge Accelerators HiPC 2024

M. Tayal, Y. Simmhan. Investigated the unexplored paradigm of heterogeneous mobile accelerators, evaluating multi-instance DNN inference performance on NVIDIA Jetson AGX Orin to understand resource congestion between CUDA cores, Tensor Cores, and DLAs under concurrent workloads.

Best Student Paper Edge Computing DNN Inference HPC

03 — Experience

Where I've Worked

MTech (Research) Student Aug 2024 — Present

RBCCPS, Indian Institute of Science, Bangalore

Conducting research on Constrained Optimization using Offline Reinforcement Learning under Prof. Ravi Prakash. Developing frameworks (EpiFlow, V-OCBF, RISE) for safe and performant offline RL with hard safety constraints.

Teaching Assistant — Imitation Learning for Robotics Jan 2026 — Present

Indian Institute of Science, Bangalore

Tutoring for the course on topics covering Offline Reinforcement Learning. Collaborating with two other TAs to design problem statements and curriculum slides.

Research & Development Intern Feb 2024 — Jun 2024

SanDisk (Western Digital), Bangalore

Worked on an end-to-end software framework for reducing LLM inference latency on consumer-grade edge devices, which resulted in a U.S. patent filing. Achieved a reduction of over 60% in the latency of loading model weights. Worked with low-level HPC toolkits, including SPDK for kernel bypass, and implemented a deep learning architecture for predictive memory fetching across model components.

Research Student — HPC & Edge ML Aug 2024 — Dec 2024

CDS, Indian Institute of Science, Bangalore

Under Prof. Yogesh Simmhan, investigated heterogeneous mobile accelerators and explored the performance behavior of multi-instance DNN inferencing on NVIDIA Jetson AGX Orin. Awarded Best Student Paper Presentation at HiPC 2024.

04 — Projects

Selected Work

Safe & Robust Offline Reinforcement Learning Jan 2025 — Present

Research under Prof. Ravi Prakash at RBCCPS, IISc. Devised an epigraph reformulation-based Offline RL framework (EpiFlow) for co-optimizing safety and performance. Developed V-OCBF for learning deployable safety filters from offline data. Proposed RISE — a variational imitation-learning framework for robust policy learning under sensor noise.

PyTorch JAX Offline RL Safety Flow Matching

Heterogeneous DNN Inference on Edge Platforms Aug — Dec 2024

Research under Prof. Yogesh Simmhan at CDS, IISc. Evaluated multi-instance DNN inferencing on NVIDIA Jetson AGX Orin, exploring resource congestion between co-located accelerators (CUDA cores, Tensor Cores, and DLAs) under concurrent workloads. The findings motivate intelligent, workload-aware load-allocation frameworks.

Python NVIDIA Jetson TensorRT Edge ML

Fixed Parameter Tractability for Vertex Cover Jan — Apr 2024

Project under Prof. Balagopal Komarath at IIT Gandhinagar. Contributed a novel algorithm to the SageMath repository to reduce the time complexity of finding the Vertex Cover size. Developed a fixed-parameter tractable algorithm that solves the Vertex Cover problem in time polynomial in the graph size for a fixed parameter k.

Python SageMath Graph Theory FPT Algorithms
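
The fixed-parameter idea behind the Vertex Cover project can be illustrated with the textbook bounded search tree: pick any uncovered edge and branch on which of its endpoints joins the cover. The recursion depth is at most k, so the running time is roughly O(2^k · m) for m edges. This is a minimal sketch under assumptions, not the algorithm contributed to SageMath; the function name `has_vertex_cover` and the edge-list input format are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def has_vertex_cover(edges: List[Edge], k: int) -> bool:
    """Return True if at most k vertices can touch every edge.

    Illustrative bounded search tree (hypothetical helper): every cover
    must contain an endpoint of each edge, so we branch on the two
    endpoints of one remaining edge and spend one unit of budget.
    """
    if not edges:
        return True   # nothing left to cover
    if k == 0:
        return False  # edges remain but the budget is spent

    u, v = edges[0]

    # Branch 1: put u in the cover and drop every edge incident to u.
    without_u = [e for e in edges if u not in e]
    if has_vertex_cover(without_u, k - 1):
        return True

    # Branch 2: put v in the cover instead.
    without_v = [e for e in edges if v not in e]
    return has_vertex_cover(without_v, k - 1)
```

For a triangle, `has_vertex_cover([(0, 1), (1, 2), (0, 2)], 1)` is false, while the same call with `k = 2` is true.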

Online Judge Platform — IIT Gandhinagar Jan — Apr 2024

Full-stack online judge for competitive programming under Prof. Neeldhara Misra. Built with a Golang/GoFiber backend and a ReactJS + Vite frontend. Designed a separate "judge server" for containerized, secure execution of user code using Docker, with parallel submission processing via goroutines and channels.

Golang React Docker GoFiber

05 — Skills

Technologies & Expertise

Languages

Python (NumPy, Pandas)
Golang
C / C++
JavaScript

ML & Research

PyTorch
JAX
Offline RL / Imitation Learning
Control Barrier Functions

Systems & Tools

Git / GitHub
Linux
Docker
xNVMe / SPDK

Coursework

Machine Learning
Bayesian Learning
Optimization Methods
Control Theory

06 — Contact

Get in Touch

I'm always open to discussing research collaborations, new ideas in safe RL and constrained optimization, or opportunities to contribute to impactful work.

Email GitHub LinkedIn Google Scholar