For over 25 years, we have been at the forefront of research and engineering around the greatest advances in technology.
Our history of innovation drives us to solve the world's hardest problems.
NVIDIA is looking for a Senior Cloud Infrastructure/DevOps Solutions Architect to join its NVIDIA Infrastructure Specialist Team.
Academic and commercial groups around the world are using NVIDIA products to revolutionize deep learning and data analytics, and to power data centers.
Join the team building many of the largest and fastest AI/HPC systems in the world. We are looking for someone who can work on a dynamic, customer-focused team that requires excellent interpersonal skills.
This role will interact with customers, partners, and internal teams to analyze, define, and implement large-scale networking projects.
The scope of these efforts includes a combination of networking, system design, and automation, as well as being the face to the customer.
What You'll Be Doing
Maintain large-scale HPC/AI clusters with monitoring, logging, and alerting.
Manage Linux job/workload schedulers and orchestration tools.
Develop and maintain continuous integration and delivery pipelines.
Develop tooling to automate deployment and management of large-scale infrastructure environments, to automate operational monitoring and alerting, and to enable self-service consumption of resources.
Deploy monitoring solutions for the servers, network, and storage.
Perform troubleshooting bottom up from bare metal, operating system, software stack, and application level.
As a technical resource, develop, redefine, and document standard methodologies to share with internal teams.
Support Research & Development activities and engage in POCs/POVs for future improvements.
What We Need To See
BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
At least 8 years of professional experience in networking fundamentals, TCP/IP stack, and data center architecture.
Knowledge of HPC and AI solution technologies, including CPUs, GPUs, high-speed interconnects, and supporting software.
Extensive knowledge and hands-on experience with Kubernetes, including container orchestration for AI/ML workloads, resource scheduling, scaling, and integration with HPC environments.
Experience in managing and installing HPC clusters, including deployment, optimization, and troubleshooting.
Experience with job scheduling workloads and orchestration technologies such as Slurm, Kubernetes, and Singularity.
Excellent knowledge of Windows and Linux systems (Red Hat/CentOS and Ubuntu), including internals, ACLs, OS-level security protections, and common protocols like TCP, DHCP, DNS, etc.
Experience with multiple storage solutions, including Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies is a plus.
Proficiency in Python programming and bash scripting.
Knowledge of CI/CD pipelines for software deployment and automation.
Comfortable with automation and configuration management tools, including Jenkins, Ansible, Puppet/Chef, etc.
Ability to communicate technical concepts and collaborate effectively with Japanese-speaking customers.
Ways To Stand Out From The Crowd
Knowledge of CPU and/or GPU architecture.
Knowledge of Kubernetes and container-related microservice technologies.
Experience with GPU-focused hardware/software (DGX, CUDA).
Background with RDMA (InfiniBand or RoCE) fabrics.
NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking individuals in the world working for us. If you're creative and autonomous, we want to hear from you.
JR1997336
-
Tokyo NVIDIA ¥120,000 - ¥180,000 per year
NVIDIA is looking for a Senior Cloud Infrastructure/DevOps Solutions Architect to join its NVIDIA Infrastructure Specialist Team. · Maintain large-scale HPC/AI clusters with monitoring, logging, and alerting · Manage Linux job/workload schedulers and orchestration tools · Develop an ...
-
Japan-Tokyo-Mori Tower Tencent
Develop and implement customized cloud solutions for clients, driving successful project delivery. · ...
-
Greater Tokyo Area Tencent ¥9,000,000 - ¥12,000,000 per year
Develop and implement customized cloud solutions for clients, driving successful project delivery. Act as the primary point of communication between clients and the backend team, ensuring alignment on requirements and industry standards while closely tracking project progress. · ...
-
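Tokyo IBM ¥4,000,000 - ¥12,000,000 per year
The AWS Solutions Architect position is a role in which employees building their careers at IBM further deepen the relationship between IBM and clients around the world and drive collaboration. · Your skills are needed to create creative solutions that bring valuable transformation to clients. · ...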
-
Solutions Architect
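1 month ago
Tokyo Snowflake ¥15,000,000 - ¥25,000,000 per year
Snowflake stands at the forefront of the data revolution and is working to build the world's best data and AI platform. Our mission is to break down data silos and "mobilize the world's data." Will you join us in building the rapidly growing future of data? · Deploy Snowflake in line with best practices, including knowledge sharing, so that customers can correctly understand Snowflake's capabilities and extend their use of it on their own. · Work hands-on with customers to demonstrate and communicate best practices for implementing Snowflake technology. · ...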
-
Solutions Architect
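2 months ago
Tokyo Snowflake ¥8,000,000 - ¥20,000,000 per year
The Professional Services team in Japan is hiring Solutions Architects to provide technical support so that customers can make the fullest and most effective use of Snowflake. This role calls for the insight to connect customers' business challenges with Snowflake solutions and to convey that relevance and Snowflake's vision to a variety of audiences (technical and executive). · A passion for solving the various challenges of customers who already have a data platform or are about to build one, and for driving change ...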
-
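System Architect Director
1 month ago
Tokyo HirePlanner Japan | Find Jobs in Japan, Work in Japan, Careers in Japan 🇯🇵 ¥20,000,000 - ¥25,000,000 per year
As System Architect Director at Catalina Marketing Japan, you will be responsible for the design, development, and operation of systems. The main responsibility is to work with engineers and PMs in the areas of system availability, operations, and maintainability to clarify the required levels of each, and then to deliver the actual systems and infrastructure based on them together with the engineers. ...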
-
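Tokyo クラスメソッド株式会社 ¥6,500,000 - ¥9,000,000
As a Senior Cloud Engineer, you will build deep, trusting relationships with customers while driving projects such as the design, build, and migration of optimal cloud solutions. · ...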
-
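Tokyo スキルハウス・スタッフィング・ソリューションズ株式会社 ¥7,000,000 - ¥14,000,000 per year
At Skillhouse, all of our staff are passionate about supporting job seekers' careers over the long term. Let's work together to shape not only a position that fits your current needs but also your career and your plans beyond it. · Skillhouse Staffing Solutions K.K. · ...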
-
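Tokyo Datadog
Datadog Sales Engineers provide support for closing deals with customers and partners. They offer technical expertise through technical demonstrations, proof-of-value (POV) evaluations, and proposals and support for resolving questions and issues, both for customers who face challenges with their current monitoring and operations setup and want to improve it and for customers considering purchasing Datadog. · Work with the sales team to clearly communicate Datadog's value proposition, vision, and strategy to customers · Continuously learn new technologies and build competitive knowledge, technical skills, and credibility · ...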
-
Tokyo スキルハウス・スタッフィング・ソリューションズ株式会社 ¥7,000,000 - ¥24,000,000 per year
Would you like to bring your IT security experience to a new stage and broaden your career further? · Skillhouse currently has many open positions for IT security specialists. · ...
-
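Tokyo Rakuten ¥1,200,000 - ¥1,500,000 per year
Rakuten Payment's business is expanding rapidly, and its scope of operations is growing with the introduction of new services and the expansion of existing ones. To support this growth, cloud environment management and operations need to be strengthened. As a Cloud Engineer, you will manage multiple environments across Azure, Google Cloud, and private cloud, and ensure that the latest updates and security measures are applied. The focus is mainly on IaaS, but you will also handle some PaaS and SaaS. ...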
-
Tokyo Pfizer ¥10,000,000 - ¥20,000,000 per year
The Sr. Manager/Staff Engineer, AI Infrastructure & MLOps Engineering is a senior technical leader responsible for architecting, building, and scaling Pfizer's AI infrastructure and developer platforms. · Design, implement, and own large-scale cloud-based HPC and MLOps platforms ...
-
Tokyo myGwork - LGBTQ+ Business Community ¥8,000,000 - ¥15,000,000 per year
This job is a senior technical leader responsible for architecting, building, and scaling Pfizer's AI infrastructure and developer platforms. · Design, implement, and own large-scale cloud-based HPC and MLOps platforms supporting AI model training, genomic sequencing, and precisi ...
-
Tokyo Pfizer Full time $120,000 - $240,000 per year
The Sr. Manager/Staff Engineer, AI Infrastructure & MLOps Engineering is a senior technical leader responsible for architecting, building, and scaling Pfizer's AI infrastructure and developer platforms. · Design, implement, and own large-scale cloud-based HPC and MLOps platforms ...
-
Solutions Architect
2 months ago
Tokyo E-Solutions ¥120,000 - ¥180,000 per year
We are looking for an experienced and customer-focused AWS Infrastructure Architect to join our growing cloud team in Tokyo. · Design and implement AWS Landing Zones using Control Tower and multi-account strategies. · Define and build secure, resilient, and scalable architectures ...
-
Tokyo Capgemini Full time ¥4,000,000 - ¥10,000,000 per year
Design and govern IT infrastructure solutions across on-premise, hybrid, and public cloud environments, ensuring scalability, security, and compliance. · VMware, Citrix, Windows/Linux servers, networking. · Cloud platforms (AWS/Azure), DevOps practices. · Fluent Japanese (JLPT N2 ...
-
【English-Only】 Cloud
2 weeks ago
Tokyo Michael Page ¥1,200,000 - ¥1,500,000 per year
Lead cloud infrastructure and deployment for cutting-edge AI services. · Collaborate with an international team across multiple countries. · Manage and maintain infrastructure on GCP, Azure, and private cloud platforms. · Deploy and scale applications efficiently across various ...
-
Solutions Architect
2 months ago
Tokyo E-Solutions ¥2,000,000 - ¥2,500,000 per year
Seeking an experienced and customer-focused AWS Solutions Architect with 10+ years of experience in cloud migration, infrastructure delivery, and architecture design. · Designing and implementing AWS infrastructure using Control Tower, Organizations, SCPs, CloudWAN, Transit Gatew ...
-
Cloud Engineer
2 months ago
Tokyo Rakuten ¥4,000,000 - ¥12,000,000 per year
Rakuten is seeking a skilled and experienced Cloud Engineer to join our dynamic Recommendations and Personalization team in Tokyo. · Lead and contribute significantly to the design, development, and optimization of core backend web application services, primarily in Python. · Arc ...
-
Tokyo Rakuten Mobile, Inc.
We are seeking a highly experienced and visionary Cloud Solution Architect to join our dynamic team, focusing specifically on Core Network Applications. · Design and implement Container Ingress Services (CIS) for seamless integration with Kubernetes clusters. · Design and deploy ...