As SREs at PayPay, we strive to ensure high availability and top-level performance so that our users enjoy a flawless, reliable service that exceeds expectations.
Considering PayPay's growth, we are looking for experienced SREs who can deliver insights into system bottlenecks and ensure system reliability and scalability, while increasing the number of services that our company offers.
We are looking for individuals who can bring informed and unique viewpoints, enjoy collaborating with a cross-functional team, and actively push boundaries to develop reliable and scalable solutions and positive user experiences.
Analyze current technologies used in the company and develop monitoring and notification tools to improve observability and visibility.
Ensure system stability by pre-emptively verifying failure scenarios and implementing solutions to reduce MTTR.
Develop solutions to improve system performance with a focus on high availability, scalability and resilience.
Integrate telemetry and alerting platforms to track and improve the reliability of systems.
Implement industry best practices for system development, configuration management and system deployment.
Ensure a seamless flow of information between teams by documenting knowledge gained.
Stay up to date on modern technologies and trends, and advocate for their inclusion in products where they add value.
Participate in incident management including troubleshooting production issues, driving root cause analysis (RCA) and actively sharing lessons learned to improve system reliability and internal knowledge.
Qualifications
Experience troubleshooting and tuning high-performance microservice architectures running on Kubernetes and AWS in highly available production environments.
5+ years of experience in software development in Python, Java, Go, etc., with strong fundamentals in data structures, algorithms, problem solving and complexity analysis.
*During the selection process, you will have a coding challenge.
Curious and proactive in finding and addressing performance bottlenecks and scalability and resilience problem areas.
Experience with observability tools and gathering data.
Knowledge of databases such as RDS, NoSQL stores, distributed databases like TiDB, etc.
Excellent communication skills and a collaborative, get-things-done attitude.
Enjoy taking up a challenge and driving it to conclusion.
Ability to verbally communicate in both English and Japanese.
Preferred Qualifications
Container image management and optimization.
Experience in large distributed system architecture and capacity planning.
Understanding of IaC and automation tools such as Terraform, CloudFormation, etc.
Background in SRE/DevOps concepts and implementation.
Experience in managing monitoring tools like CloudWatch, VictoriaMetrics, Prometheus and reporting with Snowflake and Sigma.
In-depth knowledge of web technologies such as CloudFront, Nginx, etc.
Experience in designing, implementing or maintaining disaster recovery strategies and multi-region architecture to ensure high availability, resilience, and business continuity across critical systems.
Business-level proficiency in both English and Japanese.
-
SRE(Bilingual)
1 month ago
Tokyo PayPay株式会社 ¥1,500,000 - ¥2,500,000 per year