Riya Soni

DevOps & Cloud Engineer

CKA | CKS | Terraform Associate

Project

Concurrent GPU Utilization in Kubernetes Clusters



About Project

This project addresses the challenge of using GPU resources efficiently in Kubernetes clusters. It provides a solution that lets multiple applications share GPU resources concurrently. Integrating GPU sharing mechanisms into Kubernetes improves the scalability and efficiency of GPU-accelerated workloads.

Tech Stack

  • GPU Driver: Handles communication between the operating system and the GPU hardware so that GPU resources can be allocated.
  • NVIDIA Container Runtime (nvidia-docker2): Makes NVIDIA GPUs accessible from inside containers.
  • GPU Sharing Scheduler Extender: Custom scheduler extension for Kubernetes that allocates and manages GPU resources among multiple applications.
  • GPU Sharing Device Plugin: Registers shared GPU resources through the Kubernetes device plugin mechanism so that pods can request and use them.
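
The placement decision a GPU sharing scheduler extender has to make can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes that sharing is tracked as free GPU memory in MiB per device, and the function names `filter_nodes` and `pick_gpu` are hypothetical.

```python
from typing import Dict, List


def filter_nodes(free_mib_by_node: Dict[str, List[int]], request_mib: int) -> List[str]:
    """Keep only nodes where at least one single GPU can hold the whole request.

    Assumption: a pod's GPU memory request must fit on one device; free memory
    spread across several GPUs on the same node does not count.
    """
    if request_mib <= 0:
        raise ValueError("request_mib must be positive")
    return [
        node
        for node, gpus in free_mib_by_node.items()
        if any(free >= request_mib for free in gpus)
    ]


def pick_gpu(free_mib: List[int], request_mib: int) -> int:
    """Best-fit choice: index of the fullest GPU that still fits, or -1 if none fits."""
    best = -1
    for index, free in enumerate(free_mib):
        if free >= request_mib and (best == -1 or free < free_mib[best]):
            best = index
    return best


if __name__ == "__main__":
    # Hypothetical cluster state: free GPU memory (MiB) per device on each node.
    cluster = {"node-a": [1000, 4000], "node-b": [2000, 2000]}
    print(filter_nodes(cluster, 3000))
    print(pick_gpu(cluster["node-a"], 3000))
```

Best-fit packing is only one possible policy; it tends to keep lightly used GPUs free for larger requests.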

Key Features

  • Enable multiple applications to run GPU-accelerated tasks simultaneously, improving overall system throughput.
  • Efficiently allocate and deallocate GPU resources based on application demand, ensuring optimal resource utilization.
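
As a rough illustration of allocating and releasing shared GPU capacity on demand, the bookkeeping might look like the sketch below. It is an assumption-laden example rather than the project's implementation: the class `SharedGpuPool`, its methods, and the choice of GPU memory in MiB as the shared unit are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class SharedGpuPool:
    """Tracks free memory per GPU and which pod holds which slice (hypothetical model)."""

    def __init__(self, capacity_mib: List[int]) -> None:
        self.free_mib = list(capacity_mib)
        # pod name -> (GPU index, MiB held)
        self.claims: Dict[str, Tuple[int, int]] = {}

    def allocate(self, pod: str, request_mib: int) -> Optional[int]:
        """Reserve memory on the first GPU that fits; return its index, or None."""
        if pod in self.claims:
            raise ValueError(f"{pod} already holds a GPU slice")
        for gpu, free in enumerate(self.free_mib):
            if free >= request_mib:
                self.free_mib[gpu] -= request_mib
                self.claims[pod] = (gpu, request_mib)
                return gpu
        return None

    def release(self, pod: str) -> None:
        """Return a pod's slice to the pool once the pod finishes."""
        gpu, held = self.claims.pop(pod)
        self.free_mib[gpu] += held


if __name__ == "__main__":
    pool = SharedGpuPool([16000])          # one GPU with 16000 MiB
    print(pool.allocate("train", 6000))    # both pods share GPU 0
    print(pool.allocate("infer", 8000))
    print(pool.allocate("batch", 4000))    # no room left until something is released
    pool.release("train")
    print(pool.allocate("batch", 4000))
```

Here two pods share one GPU at the same time, and a third is admitted only after capacity is released.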