Technical Program Manager, AI Inferencing

Microsoft
remote work
United States, Washington, Redmond
Dec 28, 2024
Overview

The Artificial Intelligence (AI) Delivery organization is focused on delivering various platform services to our customers worldwide. This role requires deep partnership across all of Microsoft, as these programs are on the leading edge of innovation and include driving initiatives and technology shifts that shape our ongoing commitment to deliver AI platform services rapidly at cloud scale.

We are part of Cloud Operations + Innovation (CO+I), the group responsible for one of the world's largest cloud infrastructures, powering all Microsoft online products and services, and supporting Microsoft's "Cloud First" strategy. CO+I is focused on growth, efficiency, and providing reliable experiences to customers and partners worldwide.

We are looking for a highly motivated Technical Program Manager, AI Inferencing to play a key role in preparing our AI inferencing portfolio for the transition to the next generations of GPUs and cooling technology. In this role, you will be responsible for managing a global portfolio of feasibility assessment projects to establish plans of record for GPU deployments. You will also provide program management and process development support to a variety of special projects related to GPU infrastructure in our datacenters.

The ideal candidate will leverage datacenter knowledge, technical program management experience, and strong attention to detail to establish program plans and drive cross-functional team execution.

This role can be based at any of the hub locations: Atlanta, GA; Washington, D.C.; Redmond, WA; San Antonio, TX; or Phoenix, AZ. Relocation support will be provided, and successful candidates must relocate or reside within 50 miles of the hub office location. This role is eligible for hybrid or remote work, up to 100%.

Responsibilities

Analysis:
Identify complex opportunities and gaps in GPU deployment processes, tools, and data structures.
Independently perform research, conduct analysis, and integrate relevant data to identify complex patterns, generate hypotheses, and build plans to change the way we manage our GPU deployments.

Product/Service Definition:
Work across a variety of teams and stakeholders to design integrated solutions to complex technical needs.
Translate the needs of the organization and other teams into program goals and prioritized deliverables based on data insights.

Product/Service Development:
Contribute to the development of the staging and implementation plan for piloting and release of initiatives in alignment with Objectives and Key Results (OKRs) and Key Performance Indicators (KPIs). Collaborate with stakeholders to monitor progress and adjust as needed.
Manage governance programs and processes to ensure that specific performance requirements and standards are met throughout the development lifecycle (e.g., quality, compliance, privacy, security, safety, accessibility).
Collaborate with others to track, coordinate, and communicate end-to-end project schedules. Work with others to establish and monitor processes and hold stakeholders accountable for following the established schedule and processes. Track and manage dependencies to enable cohesive, connected user scenarios, identify potential risk areas, and escalate appropriately. Make adjustments or course corrections when projects are not aligned to schedules or goals.
Create relationships to drive orchestration and integration efforts for large and complex cross-functional projects with internal teams and external partners. Validate use-case and scenario outcomes and drive continuous quality improvements to ensure performance targets are being achieved.