Responsibilities:
• Design, implement, and maintain highly available, scalable, and fault-tolerant systems.
• Monitor system performance and availability, identifying and addressing issues proactively.
• Automate repetitive tasks to improve efficiency and reliability.
• Collaborate with other engineers, development teams, and key stakeholders to ensure new features are built with reliability and scalability in mind.
• Participate in on-call rotations to provide 24/7 support for critical systems where needed.
• Develop and maintain documentation related to system architecture, processes, and procedures.
• Conduct root cause analysis for incidents and implement preventive measures.
• Continuously improve system reliability by analyzing and optimizing system performance.

Requirements:
• Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
• Minimum 4 years of proven experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role.
• Strong understanding of cloud platforms (AWS, GCP) and containerization (Docker, Kubernetes).
• Proficiency in scripting languages (Python, Bash, etc.) and infrastructure as code (Terraform, Ansible).
• Experience with several database technologies (PostgreSQL, MariaDB, Redis, etc.).
• Experience with monitoring and observability tools (Prometheus, Grafana, ELK Stack, etc.).
• Strong problem-solving skills and the ability to work under pressure.
• Experience with CI/CD pipelines and tools (CircleCI, Jenkins, GitLab CI, etc.).
• Experience with microservices architecture and distributed systems is preferred.
• Ability to communicate effectively in Thai and English.