Senior Datacenter Resiliency Architect
Company : TieTalent
Location : Santa Clara, CA, 95053
Posted Date : 18 November 2025
Job Type : Other
Category : Architecture
Occupation : Architect
Job Details
We are seeking a Senior Datacenter Resiliency (RAS) Architect to support the development and validation of GPU hardware and software resiliency features. You will be a key member of a team of innovators, challenging the status quo and pushing beyond boundaries, with impact on the industry’s leading Datacenter GPUs and SoCs powering AI and HPC products.
What you’ll be doing
- Architect hardware and software resiliency features to improve system Reliability, Availability, Serviceability (RAS), and performance in the Datacenter.
- Model and analyze RAS metrics (e.g., Failures in Time for permanent and transient errors, Availability from GPU to Rack to Datacenter); use models to identify gaps and drive RAS improvements.
- Collaborate with architects, unit designers, and software engineers to ensure alignment of verification requirements.
- Develop and implement comprehensive architecture verification test plans for resiliency features.
- Execute the architecture test plan by developing test content and by enabling, running, and debugging tests on architecture models; support test debug on RTL, emulation, and silicon.
- Run simulations to analyze Architectural Vulnerability Factor and liveness of on-die memory, flip-flops, and latches.
- Develop CUDA software diagnostics kernels to run on clusters of NVIDIA GPUs to identify hardware issues.
- Develop and automate fault models to simulate various fault types (e.g., transient faults, stuck-at faults) in gate-level netlists, RTL, architectural models, silicon, and other environments.
What we need to see
- Master’s or PhD in Computer Engineering, Electrical Engineering, or closely related field, or equivalent experience.
- 5+ years of relevant experience.
- Familiarity with GPU and networking architectures, computer architecture basics (caches, coherence, buses, DMA), and machine learning/deep learning concepts.
- Strong knowledge and experience in GPU hardware architecture or RAS features, or both.
- Proficiency in developing architecture models.
- Scripting and automation with Python or similar; proficiency in C/C++.
- Excellent interpersonal skills and ability to collaborate with on-site and remote teams; strong debugging and analytical skills; self-driven and results oriented.
- Experience with resiliency and datacenter RAS or Verilog/SystemVerilog RTL simulations and debugging; ability to set up test benches and integrate components is a plus.
- Programming with CUDA is a plus.
Company/role notes
NVIDIA’s work spans high-performance computing and AI computing. Roles involve building resilient, high-availability computing platforms for AI, HPC, and data center workloads. NVIDIA is an equal opportunity employer; we do not discriminate on the basis of protected characteristics.