Operation & Maintenance Development Engineer (SRE)

Company:

Time's Group


Details of the offer

Category: Networks & Systems Administration (Information & Communication Technology)
Aethir is the only Enterprise-grade AI-focused GPU-as-a-service provider in the market. Its decentralized cloud computing infrastructure allows GPU providers (containers) to meet Enterprise clients who need powerful GPU chips for professional AI/ML tasks. Thanks to a constantly growing network of over 40,000 top-shelf GPUs, including 3,000 NVIDIA H100s, Aethir is able to provide enterprise-grade GPU computing wherever it's needed, at scale.
Backed by leading Web3 investors like Framework Ventures, Merit Circle, Hashkey, Animoca Brands, Sanctor Capital, Infinity Ventures Crypto (IVC), and others, with over $130M in funds raised for the ecosystem, Aethir is paving the way for the future of decentralized computing.
We are looking for an operations and maintenance development engineer (SRE) to join our new headquarters in Kuala Lumpur, Malaysia, who will play a critical role in monitoring, troubleshooting, and optimizing our production system to ensure the highest levels of performance and stability for our AI and gaming customers worldwide.
Responsibilities

Monitor, Review, and Respond to Faults: Monitor and review the production system, respond to faults, troubleshoot and resolve them, and subsequently optimize the system.
System Architecture and Performance: Continuously monitor and review the system architecture, process logic, system performance, stability, and other technical areas and indicators to ensure they remain sound.
Coordination with Business Team: Drive the business team in resolving any issues related to operations and maintenance.
Production Failure Response: Respond promptly to production failures, acting as the overall coordinator for resolution.
Collaborative Problem-Solving: Organize the relevant R&D, operations and maintenance, and product teams to investigate and resolve problems together.
Failure Response Time: Take responsibility for failure response time and resolution time, ensuring issues are resolved promptly.
Case Studies and Optimization: Conduct case studies on production issues and follow up with optimizations to improve system performance and stability.
Documentation: Maintain comprehensive documentation of system architecture, processes, and troubleshooting procedures.
Continuous Improvement: Identify areas for improvement in the operations and maintenance processes and implement the necessary changes.

Requirements

Bachelor's degree in Computer Science, Engineering, or a related field.
Experience in operations and maintenance development, preferably in a cloud computing or AI-focused environment.
Strong understanding of system architecture, performance monitoring, and troubleshooting methodologies.
Excellent communication and collaboration skills.
Ability to work in a fast-paced startup environment.


Source: Grabsjobs_Co



