For over 25 years, we have been at the forefront of research and engineering around the greatest advances in technology.
Our history of innovation drives us to solve the world's hardest problems.
NVIDIA is looking for a Senior Cloud Infrastructure/DevOps Solutions Architect to join its NVIDIA Infrastructure Specialist Team.
Academic and commercial groups around the world are using NVIDIA products to revolutionize deep learning and data analytics, and to power data centers.
Join the team building many of the largest and fastest AI/HPC systems in the world. We are looking for someone who can work on a dynamic, customer-focused team that requires excellent interpersonal skills.
In this role, you will work with customers, partners, and internal teams to analyze, define, and implement large-scale networking projects.
The scope of these efforts includes a combination of networking, system design, and automation, as well as being the face to the customer.
What You'll Be Doing
Maintain large-scale HPC/AI clusters with monitoring, logging, and alerting.
Manage Linux job/workload schedulers and orchestration tools.
Develop and maintain continuous integration and delivery pipelines.
Develop tooling to automate deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
Deploy monitoring solutions for the servers, network, and storage.
Perform troubleshooting from the bottom up: bare metal, operating system, software stack, and application level.
As a technical resource, develop, redefine, and document standard methodologies to share with internal teams.
Support Research & Development activities and engage in POCs/POVs for future improvements.
What We Need To See
BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
At least 8 years of professional experience in networking fundamentals, TCP/IP stack, and data center architecture.
Knowledge of HPC and AI solution technologies, including CPUs, GPUs, high-speed interconnects, and supporting software.
Extensive knowledge and hands-on experience with Kubernetes, including container orchestration for AI/ML workloads, resource scheduling, scaling, and integration with HPC environments.
Experience in managing and installing HPC clusters, including deployment, optimization, and troubleshooting.
Experience with job scheduling workloads and orchestration technologies such as Slurm, Kubernetes, and Singularity.
Excellent knowledge of Windows and Linux systems (Red Hat/CentOS and Ubuntu), including internals, ACLs, OS-level security protections, and common protocols such as TCP, DHCP, and DNS.
Experience with multiple storage solutions, including Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies is a plus.
Proficiency in Python programming and bash scripting.
Knowledge of CI/CD pipelines for software deployment and automation.
Comfortable with automation and configuration management tools, including Jenkins, Ansible, Puppet/Chef, etc.
Ability to communicate technical concepts and collaborate effectively with Japanese-speaking customers.
Ways To Stand Out From The Crowd
Knowledge of CPU and/or GPU architecture.
Knowledge of Kubernetes and container-related microservice technologies.
Experience with GPU-focused hardware/software (DGX, CUDA).
Background with RDMA (InfiniBand or RoCE) fabrics.
NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking individuals in the world working for us. If you're creative and autonomous, we want to hear from you.
JR1997336
-
Tokyo NVIDIA ¥120,000 - ¥180,000 per year · NVIDIA is looking for Senior Cloud Infrastructure/DevOps Solutions Architect to join its NVIDIA Infrastructure Specialist Team. · Maintain large scale HPC/AI clusters with monitoring, logging and alerting · Manage Linux job/workload schedulers and orchestration tools · Develop an ...
-
Japan-Tokyo-Mori Tower Tencent Full time ¥4,000,000 - ¥12,000,000 per year · Develop and implement customized cloud solutions for clients, driving successful project delivery. Act as the primary point of communication between clients and the backend team, ensuring alignment on requirements and industry standards while closely tracking project progress. · ...
-
Japan-Tokyo-Mori Tower Tencent ¥10,000,000 - ¥15,000,000 per year · Develop and implement customized cloud solutions for clients, driving successful project delivery. Act as the primary point of communication between clients and the backend team, ensuring alignment on requirements and industry standards while closely tracking project progress. Pa ...
-
Greater Tokyo Area Tencent ¥9,000,000 - ¥12,000,000 per year · Develop and implement customized cloud solutions for clients, driving successful project delivery. Act as the primary point of communication between clients and the backend team, ensuring alignment on requirements and industry standards while closely tracking project progress. · ...
-
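Tokyo IBM ¥4,000,000 - ¥12,000,000 per year · The AWS Solutions Architect position is a role in which employees building their careers at IBM deepen the relationship between IBM and clients around the world and drive collaboration. · Your skills are needed to create inventive solutions that bring valuable transformation to clients. · ...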
-
Software Engineer I
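1 month ago
Tokyo NCR Voyix ¥4,000,000 - ¥12,000,000 per year · NCR Commerce Japan (日本NCRコマース) is a global service provider that uses technology to support customers in retail, restaurants, and digital banking. Through its comprehensive, platform-led SaaS and service capabilities, NCR Commerce Japan transforms the customer experience in the retail distribution and financial industries. · Bachelor's degree in computer science or a related field; master's degree preferred. · 10+ years of product design and development experience · Deep understanding of computer science fundamentals · Strong development, debugging, and troubleshooting skills · Excellent written and verbal communication skills · In a team ...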
-
Solutions Architect
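4 weeks ago
Tokyo Snowflake ¥8,000,000 - ¥20,000,000 per year · The Professional Services team in Japan is hiring a Solutions Architect to provide technical support that helps customers get the fullest and best use out of Snowflake. This role calls for the insight to connect customers' business challenges with Snowflake solutions and to convey that connection, along with Snowflake's vision, to a variety of audiences (technical and executive). · A passion for solving the varied challenges of customers who have, or are about to build, a data platform, and for driving change ...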
-
Software Engineer I
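1 month ago
Tokyo NCR Voyix Full time ¥4,000,000 - ¥12,000,000 per year · NCR Commerce Japan (日本NCRコマース) is a global service provider that uses technology to support customers in retail, restaurants, and digital banking. Through its comprehensive, platform-led SaaS and service capabilities, NCR Commerce Japan transforms the customer experience in the retail distribution and financial industries. · ...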
-
Solutions Architect
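3 weeks ago
Tokyo Snowflake ¥15,000,000 - ¥25,000,000 per year · Snowflake stands at the forefront of the data revolution and is working to build the world's best data and AI platform. Our mission is to break down data silos and "mobilize the world's data." Why not join us in building the fast-growing future of data? · Deploy Snowflake in line with best practices, including knowledge sharing, so that customers can correctly understand Snowflake's capabilities and extend their use on their own. · Work hands-on directly with customers to demonstrate and communicate best practices for implementing Snowflake technology. · ...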
-
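System Architect Director
3 weeks ago
Tokyo HirePlanner Japan | Find Jobs in Japan, Work in Japan, Careers in Japan 🇯🇵 ¥20,000,000 - ¥25,000,000 per year · The System Architect Director is responsible for the design, development, and operation of systems at Catalina Marketing Japan. The main responsibility is to work with engineers and PMs in the areas of system availability, operations, and maintainability, to clarify the required levels of availability, operations, and maintainability, and to deliver the resulting systems and infrastructure together with the engineers. ...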
-
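Tokyo スキルハウス・スタッフィング・ソリューションズ株式会社 (Skillhouse Staffing Solutions K.K.) ¥7,000,000 - ¥14,000,000 per year · Why not put your IT security experience to work on a new stage and broaden your career further? · 5+ years of experience in the technical fields above · Information security: risk audit · Information security: governance · Information security: system risk assessment · Information security: risk analyst · IT security specialist (engineer/architect, infrastructure engineer) · IT security: compliance · Cybersecurity · DFIR analyst · Internal security · The technical fields above · Full social insurance · No smoking indoors · Commuting expenses paid · ...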
-
Enterprise Architect
2 months ago
Tokyo Pegasystems ¥12,000,000 - ¥24,000,000 per year · We are a team of enterprise transformation experts leveraging Pega's AI-powered platform to modernize legacy systems across Japan's leading enterprises. · Re-architect legacy assets (e.g., mainframe, COBOL, Lotus Notes) into cloud-native solutions using Amazon Q and Pega Blueprin ...
-
Tokyo Pfizer ¥10,000,000 - ¥20,000,000 per year · The Sr. Manager/Staff Engineer, AI Infrastructure & MLOps Engineering is a senior technical leader responsible for architecting, building, and scaling Pfizer's AI infrastructure and developer platforms. · Design, implement, and own large-scale cloud-based HPC and MLOps platforms ...
-
Tokyo myGwork - LGBTQ+ Business Community ¥8,000,000 - ¥15,000,000 per year · This job is a senior technical leader responsible for architecting, building, and scaling Pfizer's AI infrastructure and developer platforms. · Design, implement, and own large-scale cloud-based HPC and MLOps platforms supporting AI model training, genomic sequencing, and precisi ...
-
Tokyo Pfizer Full time $120,000 - $240,000 per year · The Sr. Manager/Staff Engineer, AI Infrastructure & MLOps Engineering is a senior technical leader responsible for architecting, building, and scaling Pfizer's AI infrastructure and developer platforms. · Design, implement, and own large-scale cloud-based HPC and MLOps platforms ...
-
Tokyo Tektome · At Tektome, we're at the forefront of technological innovation for the AEC industry. We are a spin-out of one of the first AI companies in Japan, Incubit. We're looking for dedicated professionals who share our passion and ambition. · Designing and managing Azure infrastructure u ...
-
Sr. Cloud Engineer
2 months ago
Tokyo First Point Group ¥1,800,000 - ¥2,500,000 per year · We are seeking a highly skilled Senior Cloud Engineer with deep expertise in Microsoft Azure to join one of the main Telecom companies worldwide. · Architect, deploy, and manage Azure infrastructure tailored for Fortinet solutions. · Lead the deployment and integration of FortiPAM w ...
-
Tokyo Oracle ¥10,000,000 - ¥20,000,000 per year · We are seeking a highly skilled Infrastructure Lead to architect, design, and deliver cloud-native infrastructure for COTS applications, with a focus on Oracle Billing and Revenue Management (BRM). · Architect and implement cloud-native infrastructure solutions for Oracle BRM and ...
-
Solutions Architect
1 month ago
Tokyo E-Solutions ¥120,000 - ¥180,000 per year · We are looking for an experienced and customer-focused AWS Infrastructure Architect to join our growing cloud team in Tokyo. · Design and implement AWS Landing Zones using Control Tower and multi-account strategies. · Define and build secure, resilient, and scalable architectures ...
-
Tokyo Capgemini Full time ¥4,000,000 - ¥10,000,000 per year · Design and govern IT infrastructure solutions across on-premise, hybrid, and public cloud environments, ensuring scalability, security, and compliance. · VMware, Citrix, Windows/Linux servers, networking. · Cloud platforms (AWS/Azure), DevOps practices. · Fluent Japanese (JLPT N2 ...
-
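Tokyo Rakuten ¥1,200,000 - ¥1,500,000 per year · Rakuten Payment's business is expanding rapidly, and its scope of operations is widening as new services are introduced and existing ones are extended. To support this growth, we need to strengthen the management and operation of our cloud environments. As a Cloud Engineer, you will manage multiple environments across Azure, Google Cloud, and Private Cloud, and make sure the latest updates and security measures are applied. The focus is mainly on IaaS, but you will also handle some PaaS and SaaS. ...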