Senior Solutions Engineer, AI/HPC Networking

DriveNets

Full-time

Remote

Worldwide

DevOps / Sysadmin

About the Company

DriveNets is a leader in disaggregated high-scale networking solutions for service providers and AI infrastructure. Founded in December 2015, DriveNets created a radical new way to build networks by adapting the architectural model of the cloud to telco-grade networking. This solution accelerates network deployment, improves the network’s economic model, and makes network operations much simpler. Customers include Comcast, Orange, and KDDI. Over 80% of AT&T’s network traffic now runs through a disaggregated core powered by DriveNets software. The DriveNets Network Cloud-AI solution, based on the same technology, was introduced to the market in 2023, providing the highest-performance Ethernet-based AI networking solution, and is already deployed by hyperscalers, neoclouds, and enterprises. Having raised over $587 million in three funding rounds, DriveNets continues to deploy the most advanced network infrastructure and is looking for the most talented people to join this effort.

Responsibilities

Build strong AI/HPC infrastructure for new and existing customers.
Take a hands-on technical role in building and supporting NVIDIA/AMD-based platforms.
Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, training stability, real-time monitoring, logging, and alerting.
Administer Linux systems, ranging from powerful GPU-enabled servers to general-purpose compute systems.
Design and plan rack layouts and network topologies to support customer requirements.
Design and evaluate automation scripts for network operations, including the configuration of server and switch fabrics.
Perform data center upgrades and ensure that deployment of DriveNets solutions goes smoothly.
Install and configure DriveNets products, ensuring that performance is optimal and customers are satisfied.
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
Provide feedback to internal teams such as opening bugs, documenting workarounds, and suggesting improvements.
Engage with sales teams and customers to ensure success with major opportunities and deployments.
Introduce new products to the DriveNets sales and support teams and to DriveNets customers.
Deliver technical training sessions and TOIs for support/sales engineers, partners, and customers.
Collaborate on product definition through customer requirement gathering and roadmap planning.

Requirements

BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, or other Engineering fields, or equivalent experience.
3+ years of network engineering (system/solution) experience.
3+ years of solution architecture/sales engineering experience, or equivalent, working for a vendor, value-added reseller, or system integrator.
Technical expertise in data center or high-end enterprise network design (e.g., BGP, EVPN, VXLAN, QoS, multicast).
Expertise with data center design, including networking, compute, and storage.
Ability to write extensive technical content (white papers, technical briefs, etc.) for external audiences with a balance of technical accuracy, strategy, and clear messaging.
Ability to multitask efficiently in a multifaceted environment and to work with teams across geographical locations.
Clear written and oral communication skills with the ability to effectively collaborate with executives and engineering teams.
Ability to travel domestically and internationally up to 20% of the time.
Be Kind.

Preferred Qualifications (Nice to Have)

Familiarity with AI-relevant data center infrastructure and networking technologies such as InfiniBand, RoCEv2, lossless Ethernet technologies (PFC, ECN, etc.), accelerated computing, GPU, NIC, and DPU.
Understanding of AI/HPC networking infrastructure solutions, their advantages and disadvantages (AI/HPC networking design, high-speed interconnect technologies).
Scale-up – NVLink, UALink, etc.
Scale-out – Ethernet and Enhanced Ethernet (Scheduled Ethernet, dynamic load balancing and adaptive routing, Spectrum-X, UEC, etc.), InfiniBand.
Backend storage connectivity.
Understanding of data center operations fundamentals in networking, cooling, and power.
Familiarity with monitoring tools (e.g., Prometheus, Grafana, ELK Stack) and telemetry (gRPC, gNMI, OTLP, etc.).
Proven experience with one or more Tier-1 Clouds (AWS, Azure, GCP, or OCI) or emerging Neoclouds, as well as cloud-native architectures and software.

Location

Bay Area - remote. WFH-Remote role with travel to customers.
