Riya Soni

DevOps & Cloud Engineer

CKA | CKS | Terraform Associate

riyasoni5990@gmail.com

RESUME

GitHub

Project

Concurrent GPU Utilization in Kubernetes Clusters

About Project

This project addresses the challenge of using GPU resources efficiently in Kubernetes clusters. I developed a solution that lets multiple applications share GPU resources concurrently. Integrating GPU sharing mechanisms into Kubernetes improves the scalability and efficiency of GPU-accelerated workloads.

Tech Stack

GPU Driver: Facilitates communication between the operating system and the GPU hardware, ensuring seamless resource allocation.
NVIDIA Container Runtime (nvidia-docker2): Enhances containerization by providing compatibility and optimized performance for NVIDIA GPUs.
GPU Sharing Scheduler Extender: Custom scheduler extension for Kubernetes that intelligently allocates and manages GPU resources among multiple applications.
GPU Sharing Device Plugin: Enables dynamic device plugin registration for shared GPU resources, ensuring efficient utilization across pods.
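
To make the scheduler extender's role more concrete, here is a minimal Python sketch of the kind of decision such a component might make. It is not the project's actual code: the `Node` class, the `filter_nodes` and `pick_gpu` functions, and the use of GPU memory in MiB as the shared unit are all assumptions made for illustration. The sketch keeps only nodes where a single GPU can hold the whole request, then picks the fullest GPU that still fits (a simple bin-packing policy).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical view of a cluster node as seen by a scheduler extender."""
    name: str
    # Free memory per GPU in MiB, indexed by GPU device id (assumed model).
    gpu_free_mib: List[int] = field(default_factory=list)


def filter_nodes(nodes: List[Node], request_mib: int) -> List[Node]:
    """Keep nodes where at least one single GPU can hold the whole request."""
    return [
        node for node in nodes
        if any(free >= request_mib for free in node.gpu_free_mib)
    ]


def pick_gpu(node: Node, request_mib: int) -> Optional[int]:
    """Bin-pack: choose the GPU with the least free memory that still fits.

    Returns the GPU index, or None if no GPU on the node fits the request.
    """
    candidates = [
        (free, index)
        for index, free in enumerate(node.gpu_free_mib)
        if free >= request_mib
    ]
    if not candidates:
        return None
    # min() on (free, index) tuples picks the smallest sufficient GPU.
    return min(candidates)[1]


if __name__ == "__main__":
    cluster = [Node("node-a", [1000, 4000]), Node("node-b", [500])]
    fitting = filter_nodes(cluster, 2000)
    print([node.name for node in fitting])
    print(pick_gpu(cluster[0], 800))
```

Packing requests onto the fullest GPU that still fits leaves larger contiguous capacity free on other GPUs for bigger requests; a spread policy would instead pick the emptiest GPU.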

Key Features

Enable multiple applications to run GPU-accelerated tasks simultaneously, improving overall system throughput.
Efficiently allocate and deallocate GPU resources based on application demand, ensuring optimal resource utilization.
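
The allocate-and-deallocate behavior described above can be sketched as simple per-GPU bookkeeping. This is an illustrative assumption rather than the project's implementation: the `GpuMemoryPool` class, its method names, and the choice of MiB of GPU memory as the accounting unit are hypothetical.

```python
from typing import Dict


class GpuMemoryPool:
    """Hypothetical tracker for memory shared by several pods on one GPU."""

    def __init__(self, total_mib: int) -> None:
        self.total_mib = total_mib
        # Maps a pod name to the MiB currently reserved for it.
        self.allocations: Dict[str, int] = {}

    @property
    def free_mib(self) -> int:
        """Memory not yet reserved by any pod."""
        return self.total_mib - sum(self.allocations.values())

    def allocate(self, pod: str, mib: int) -> bool:
        """Reserve memory for a pod; refuse if it does not fit."""
        if pod in self.allocations or mib <= 0 or mib > self.free_mib:
            return False
        self.allocations[pod] = mib
        return True

    def release(self, pod: str) -> int:
        """Free a pod's reservation and return the amount released."""
        return self.allocations.pop(pod, 0)


if __name__ == "__main__":
    pool = GpuMemoryPool(total_mib=8000)
    print(pool.allocate("trainer", 5000))
    print(pool.allocate("inference", 4000))
    pool.release("trainer")
    print(pool.allocate("inference", 4000))
```

In this sketch the second request is refused until the first pod releases its share, which is the demand-driven behavior the feature list describes.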