Job # 9371RK

  • New York, NY

Machine Learning Performance Engineer – Global Trading Firm

Job Overview

Our client, a global leader in quantitative trading, is seeking a Machine Learning Performance Engineer to join its New York engineering team. This role plays a critical part in advancing large-scale deep learning infrastructure that supports high-frequency and futures trading strategies.

Key Responsibilities

  • Design, optimize, and deploy scalable training and inference pipelines for deep learning models.
  • Enhance and extend open-source ML frameworks to improve performance and reliability.
  • Identify and resolve GPU and system-level bottlenecks to accelerate model training.
  • Collaborate closely with researchers, quants, and software engineers to integrate ML models into trading systems.
  • Develop a deep understanding of trading workflows, market data, and performance-critical systems.

Qualifications

  • Strong expertise in GPU programming (CUDA), including Tensor Cores, cooperative groups, CUDA graphs, and warp-level intrinsics.
  • Proven experience with deep learning internals in PyTorch, TensorFlow, or JAX.
  • Deep understanding of computer architecture, parallel computing, and large-scale distributed training.
  • Proficiency in C++ and Python.

Preferred: Experience with the JAX ecosystem (XLA, Flax), GPU libraries (Triton, cuDNN, cuBLAS), Linux system programming, and open-source ML contributions.

Why Apply

Join a team of world-class engineers and researchers tackling some of the most advanced challenges in trading and AI infrastructure. This is a greenfield opportunity to shape GPU- and TPU-based deep learning pipelines at scale within a high-impact, globally recognized trading firm.

Call or Email Us

781.599.9300 team@ndt.com