AI Data Centre Operations Manager

Details of the offer

Overview: Our client is seeking an experienced AI Cloud Data Centre Operations Manager with expertise in GPU architecture to join their dynamic team. In this role, you will be responsible for overseeing and optimizing the operations of the company's AI-focused cloud data centers, ensuring efficient resource management, uptime, and scalability of GPU-intensive systems. Your knowledge in GPU architecture and data center management will be key to optimizing AI workloads and supporting mission-critical applications.
Key Responsibilities:
Data Center Operations Management: Lead and manage day-to-day operations of AI-focused data centers, ensuring optimal performance, availability, and reliability of all hardware and software systems.
GPU Resource Optimization: Develop strategies for effective utilization of GPU resources in alignment with business and operational goals. Monitor and optimize GPU workloads to maximize performance and cost-efficiency.
Infrastructure Planning and Scaling: Collaborate with infrastructure and development teams to plan, deploy, and scale data center resources to support AI workloads, considering current and future requirements.
Performance Monitoring & Troubleshooting: Implement monitoring tools and dashboards for real-time analysis of GPU and overall data center performance. Troubleshoot and resolve issues to maintain high availability and minimal downtime.
System Upgrades & Maintenance: Schedule and oversee hardware and software upgrades, including GPU infrastructure, ensuring compatibility and system optimization.
Security & Compliance: Ensure data center operations meet industry standards for security and regulatory compliance, implementing best practices for data protection and cybersecurity.
Vendor and Stakeholder Management: Engage with vendors for hardware procurement and support, and collaborate with cross-functional teams for project planning and execution.

Key Qualifications:
Education: Bachelor's degree in Computer Science, Engineering, or a related field; Master's degree preferred.
Experience: 5+ years of experience in data center operations, with a focus on AI or cloud environments. Experience with managing GPU-intensive systems is essential.
Certifications in Data Center Management or cloud platforms.

Technical Expertise:
Strong understanding of GPU architecture and experience optimizing workloads for AI and ML applications.
Proficiency in data center infrastructure management (DCIM) tools and monitoring systems.
Knowledge of cloud computing platforms (AWS, Google Cloud, Azure) and hybrid cloud environments.

Desired Skills:
Strong problem-solving and analytical skills, especially in troubleshooting data center and GPU performance issues.
Excellent communication and leadership skills, with the ability to work cross-functionally and drive projects to completion.
Familiarity with security standards and regulatory compliance requirements for data center operations.

Interested applicants, please send your latest resume to:
Jacqueline Ng  [email protected]
We regret that only shortlisted candidates will be notified.


Nominal Salary: To be agreed

Source: Grabsjobs_Co

Built at: 2024-11-22T03:59:39.291Z