Senior Compute Kernel Architect, GPU Power
The posting mentions 'vibe coding' for rapid scripting and automation of power characterization workflows and power experiments.
About the Role
This senior role designs and implements CUDA kernels and power-stress workloads that deliberately push GPU power and current draw, both to validate Power Delivery Network (PDN) behavior and to inform GPU power architecture. The architect partners with hardware architects and silicon validation teams to analyze power vs. performance trade-offs and ensure voltage stability across GPU families.
Job Description
Role
NVIDIA is hiring a Senior Compute Kernel Performance Architect focused on GPU power. The role centers on writing, profiling, and analyzing CUDA kernels with the explicit goal of exercising and characterizing GPU power and current draw, and on working with hardware teams to validate PDN and di/dt behavior from pre-silicon through post-silicon bring-up.
Key Responsibilities
- Design and develop CUDA kernels and stress workloads aimed at worst-case current draw across compute, memory, and I/O subsystems.
- Build and maintain a library of power stress microbenchmarks that sweep power profiles across functional units (tensor cores, memory controllers, I/O interfaces).
- Collaborate with hardware power architects and circuit designers to validate PDN assumptions and di/dt specs and to identify weak points.
- Analyze trade-offs between kernel throughput, power efficiency, and voltage stability; feed findings into GPU architecture decisions.
- Partner across GPU architecture, silicon validation, and power architecture teams to align power stress methodologies from simulation to post-silicon testing.
Requirements
- MS or PhD in Computer Science, Electrical Engineering, Computer Engineering, or equivalent experience.
- 5+ years of experience in GPU kernel development, CUDA programming, or high-performance computing.
- Strong CUDA and C++ programming skills; experience optimizing kernels at the assembly or PTX level.
- Experience with GPU performance profiling tools such as Nsight Compute, Nsight Systems, nvprof, or equivalents.
- Solid understanding of GPU architecture (SMs, memory hierarchy, power states) and how these map to current draw profiles.
- Working knowledge of Power Delivery Networks (PDNs), package inductance, decoupling, and their impact on voltage droop/overshoot.
- Conceptual understanding of di/dt and how software workloads can control or stress current transitions.
- Strong Python skills for scripting, data analysis, and automation of power characterization workflows.
- Excellent communication skills and comfort working across hardware and software disciplines.
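For readers gauging the PDN and di/dt items above, the first-order relationship behind them is the inductive voltage drop V = L · di/dt: a faster or larger current step across the same inductance produces a deeper droop. The sketch below is only an illustration of that formula in Python. The function name and all numeric values are hypothetical and do not describe any NVIDIA part or internal tool.

```python
def inductive_droop_volts(inductance_h: float, delta_i_a: float, delta_t_s: float) -> float:
    """First-order droop estimate V = L * di/dt.

    Ignores resistance and decoupling capacitance, so it is an
    upper-level intuition aid, not a PDN simulation.
    """
    if delta_t_s <= 0:
        raise ValueError("delta_t_s must be positive")
    return inductance_h * (delta_i_a / delta_t_s)


# Hypothetical example values (assumptions for illustration only):
# 10 pH effective loop inductance, 100 A load step over 20 ns.
droop = inductive_droop_volts(10e-12, 100.0, 20e-9)
print(f"estimated droop: {droop * 1e3:.1f} mV")
```

Halving the step time in this model doubles the estimated droop, which is why the way a workload ramps its activity matters as much as its peak current.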
Preferred / Ways to Stand Out
- Hands-on experience writing GPU power stress microbenchmarks targeting worst-case power on specific functional units.
- Direct post-silicon power characterization experience measuring VDD droop, di/dt slew rates, and transient response with lab instruments.
- Experience with DVFS, AVFS, and noise mitigation techniques and their interactions with kernel behavior.
- Knowledge of PDN impedance targets across die, package, and board and mapping resonance frequencies to droop signatures.
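As a rough illustration of the last item, a PDN stage can be approximated as a lumped LC tank whose resonance is f = 1 / (2π√(LC)); a workload that toggles its load near that frequency tends to excite the largest droop. The Python sketch below assumes that simplified single-stage model. The function name and the component values are hypothetical and chosen only to make the arithmetic concrete.

```python
import math


def lc_resonance_hz(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency of an ideal lumped LC stage: f = 1 / (2*pi*sqrt(L*C))."""
    if inductance_h <= 0 or capacitance_f <= 0:
        raise ValueError("inductance and capacitance must be positive")
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical example values (assumptions for illustration only):
# 10 pH of inductance resonating with 1 uF of capacitance.
f_res = lc_resonance_hz(10e-12, 1e-6)
toggle_period_s = 1.0 / f_res  # load on/off period that would line up with the resonance

print(f"resonance: {f_res / 1e6:.1f} MHz, toggle period: {toggle_period_s * 1e9:.1f} ns")
```

A real PDN has several such stages across die, package, and board, each with its own resonance, so a single-stage estimate like this is only a starting point for choosing toggle frequencies to sweep.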
Team & Impact
This team influences NVIDIA’s GPU power delivery and performance stack, working closely with Compute Architecture, Silicon Solutions, Power Architecture, and framework teams to shape future GPU silicon and software behavior.
Compensation
Base salary ranges listed in the posting:
- Level 4: 184,000 USD - 287,500 USD
- Level 5: 224,000 USD - 356,500 USD
Candidates are also eligible for equity and benefits.