Research Student · IISc Bangalore

Mumuksh Tayal

MTech (Research) student at the Robert Bosch Centre for Cyber-Physical Systems, Indian Institute of Science, Bangalore. My research focuses on solving Constrained Optimization problems using Offline Reinforcement Learning.


Background


I'm a research student at the Indian Institute of Science (IISc), Bangalore, working at the Robert Bosch Centre for Cyber-Physical Systems (RBCCPS). My research lies at the intersection of Reinforcement Learning, Optimization, and Safety. Specifically, I develop frameworks for safe and performant Offline RL that can handle hard, state-wise safety constraints.

Prior to IISc, I completed my engineering degree at IIT Gandhinagar, where I developed a strong interest in computers and systems. I've also worked as an R&D Intern at SanDisk (Western Digital), where I contributed to optimizing LLM inference on edge devices.

Research Papers


EpiFlow: Epigraph-Guided Flow Matching for Safe and Performant Offline RL ICML 2026 · ICLR VerifAI Workshop 2026

M. Tayal, M. Tayal. Devised a value-function-guided Offline RL framework, based on an epigraph reformulation, that co-optimizes the agent's policy for safety and performance: it maximizes reward while satisfying hard, state-wise safety constraints at every step.

Submitted · ICLR Workshop Oral · Offline RL · Flow Matching · Safety
V-OCBF: Value-Guided Offline Control Barrier Functions TMLR · AAAI 2026

M. Tayal, M. Tayal, S. Kolathaya, R. Prakash. A value-guided framework that learns actuation-aware, infinite-horizon safe Neural Control Barrier Functions from offline demonstrations — producing deployable safety filters that satisfy hard, state-wise constraints.

TMLR Submitted · AAAI Oral · Control Barrier Functions · Safety Filters
RISE: Robust Imitation through Stochastic Encodings IROS 2025

M. Tayal, M. Tayal, R. Prakash. A variational, parameter-conditioned imitation-learning framework that encodes noisy/stochastic obstacle and goal sensor readings into a probabilistic latent space to improve robustness of offline-learned models.

LeaPRiDE Workshop · Imitation Learning · Variational Inference
Multi-Instance DNN Inferencing on Heterogeneous Edge Accelerators HiPC 2024

M. Tayal, Y. Simmhan. Investigated the unexplored paradigm of heterogeneous mobile accelerators, evaluating multi-instance DNN inference performance on NVIDIA Jetson AGX Orin to understand resource congestion between CUDA cores, Tensor Cores, and DLAs under concurrent workloads.

Best Student Paper · Edge Computing · DNN Inference · HPC

Where I've Worked


MTech (Research) Student Aug 2024 — Present

RBCCPS, Indian Institute of Science, Bangalore

Conducting research on Constrained Optimization using Offline Reinforcement Learning under Prof. Ravi Prakash. Developing frameworks (EpiFlow, V-OCBF, RISE) for safe and performant offline RL with hard safety constraints.

Teaching Assistant — Imitation Learning for Robotics Jan 2026 — Present

Indian Institute of Science, Bangalore

Tutoring students on course topics covering Offline Reinforcement Learning. Collaborating with two other TAs to design problem statements and course slides.

Research & Development Intern Feb 2024 — Jun 2024

SanDisk (Western Digital), Bangalore

Worked on an end-to-end software framework to reduce LLM inference latency on consumer-grade edge devices, which resulted in a U.S. patent filing. Achieved over 60% reduction in the latency of loading model weights. Worked with low-level HPC toolkits, including SPDK for kernel bypass, and implemented a deep learning architecture for predictive memory fetching across model components.

Research Student — HPC & Edge ML Aug 2024 — Dec 2024

CDS, Indian Institute of Science, Bangalore

Under Prof. Yogesh Simmhan, investigated heterogeneous mobile accelerators and explored performance behavior of multi-instance DNN inferencing on NVIDIA Jetson AGX Orin. Awarded Best Student Paper Presentation at HiPC 2024.

Selected Work


Safe & Robust Offline Reinforcement Learning Jan 2025 — Present

Research under Prof. Ravi Prakash at RBCCPS, IISc. Devised an epigraph reformulation-based Offline RL framework (EpiFlow) for co-optimizing safety and performance. Developed V-OCBF for learning deployable safety filters from offline data. Proposed RISE — a variational imitation-learning framework for robust policy learning under sensor noise.

PyTorch · JAX · Offline RL · Safety · Flow Matching
Heterogeneous DNN Inference on Edge Platforms Aug — Dec 2024

Research under Prof. Yogesh Simmhan at CDS, IISc. Evaluated multi-instance DNN inferencing on NVIDIA Jetson AGX Orin, exploring resource congestion between co-located accelerators (CUDA cores, Tensor Cores, and DLAs) under concurrent workloads. The findings motivate intelligent, workload-aware load-allocation frameworks.

Python · NVIDIA Jetson · TensorRT · Edge ML
Fixed Parameter Tractability for Vertex Cover Jan — Apr 2024

Project under Prof. Balagopal Komarath at IIT Gandhinagar. Contributed a novel algorithm to the SageMath repository to reduce the time complexity of computing the Vertex Cover size. Developed a fixed-parameter tractable algorithm for the Vertex Cover problem whose running time is polynomial in the graph size for a fixed parameter k.

Python · SageMath · Graph Theory · FPT Algorithms
Online Judge Platform — IIT Gandhinagar Jan — Apr 2024

Full-stack online judge for competitive programming, built under Prof. Neeldhara Misra with a Golang/GoFiber backend and a ReactJS + Vite frontend. Designed a separate "judge server" for secure, containerized execution of user code using Docker, with parallel submission processing via goroutines and channels.

Golang · React · Docker · GoFiber

Technologies & Expertise


Languages

  • Python (NumPy, Pandas)
  • Golang
  • C / C++
  • JavaScript

ML & Research

  • PyTorch
  • JAX
  • Offline RL / Imitation Learning
  • Control Barrier Functions

Systems & Tools

  • Git / GitHub
  • Linux
  • Docker
  • xNVMe / SPDK

Coursework

  • Machine Learning
  • Bayesian Learning
  • Optimization Methods
  • Control Theory

Get in Touch


I'm always open to discussing research collaborations, new ideas in safe RL and constrained optimization, or opportunities to contribute to impactful work.