Keywords: AI Inference Engineer, Location: Beijing

AI Inference Engineer

Job Description: What you do at AMD changes everything. At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a cul...

Location: Beijing
Posted Date: 16 Dec 2025

AI Software System Engineer (Kernel/Runtime)

your career. THE ROLE: As an AICE Software System Design Engineer, you will be responsible for the development, debugging..., distributions, compilers, performance optimizations for inference or training, along with strong programming skills in C...

Location: Beijing
Posted Date: 05 Mar 2026

Master Principal Cloud Engineer – GPU & AI Infrastructure

Job Category: Pre Sales Job Description: Position Overview As a GPU Specialist Cloud Engineer (CE) within the.... Optimization: Advise customers on right-sizing GPU shapes based on workload requirements (e.g., training vs. inference, FP8 vs...

Company: Oracle
Location: Beijing
Posted Date: 05 Mar 2026

Devtech Compute Engineer

and inference on GPU. You’ll join a team of ML, HPC and Software Engineers and Applied Researchers developing a framework designed...: In your role as Devtech Compute Engineer or CUDA Performance Engineer you will be primarily responsible for the development of performance...

Company: Nvidia
Location: Beijing
Posted Date: 28 Feb 2026

Software Engineer for SPICE (AI)

) and their integration into Cadence’s EDA ecosystem. The engineer will architect intelligent systems that enhance productivity, automate... scalable systems for model training, inference, and integration with existing tools Collaborate with simulation and analysis...

Location: Beijing
Posted Date: 06 Feb 2026

Senior Research Engineer - Multimodal & Video Foundation Model

for multimodal language models, integrating text, visual, and audio modalities. Engineer scalable training and inference pipelines... experience working with the full development pipeline from data processing & data loading to training, inference...

Location: Beijing
Posted Date: 27 Jan 2026

Senior Machine Learning Engineer - AI Effects and Editing

robust pipelines for LoRA-based model training, post-training quantization, and inference optimisation. Develop... with LoRA training, model post-processing (quantization, pruning), and on-device inference optimisation. Familiarity with image...

Company: Canva
Location: Beijing
Posted Date: 11 Mar 2026

Senior Software Engineer

solutions encompassing backend AI service APIs, model inference optimization, and frontend interfaces to showcase new... AI models (e.g., diffusion models for image/video, GANs, autoregressive models). Building and optimizing inference pipelines...

Company: Microsoft
Location: Beijing
Posted Date: 11 Mar 2026

Senior Software Engineer

-solving skills in LLM inference optimization, token efficiency, and response tuning. Experience with AI frameworks...

Company: Microsoft
Location: Beijing
Posted Date: 06 Mar 2026

Senior Software Engineer

, or Triton. Optimize model inference and training pipelines for speed, throughput, memory efficiency, and cost across GPU... and architecture design. Familiar with inference optimization, experience in developing popular inference framework such as TensorRT...

Company: Microsoft
Location: Beijing
Posted Date: 05 Mar 2026

AI Product Performance Engineer

Integration: Collaborate with software stack teams to expose optimized kernels within high-level frameworks and inference engines... using OpenAI Triton or other Python-based DSLs for agile kernel development and auto-tuning. Inference Engine Experience...

Location: Beijing
Posted Date: 03 Mar 2026

Generative AI Algorithms Engineer

of multimodal inference and training, such as image generation, 3D, video generation, editing, ViT and other models. Efficient... inference algorithms research and advanced quantization, e.g. batching, KV caching, efficient attentions, long context...

Company: Qualcomm
Location: Beijing
Posted Date: 02 Mar 2026

Developer Technology Engineer - AI

, through both library development and direct contribution to the applications. This includes training and inference... of software design, programming techniques, and algorithms. Expert knowledge of LLM training/inference optimization, including...

Company: Nvidia
Location: Beijing
Posted Date: 01 Mar 2026

Senior Developer Technology Engineer

on maximizing training and inference speed while enabling effortless scalability. What You’ll Be Doing: Profile, analyze..., and optimize GPU‑accelerated code to improve training and inference performance for large‑scale recommender systems. Design...

Company: Nvidia
Location: Beijing
Posted Date: 28 Feb 2026

AI Video Research Engineer Intern

, supervised fine-tuning, post-training, inference, architecture design, or evaluation Benchmark models against current state...

Location: Beijing
Posted Date: 19 Feb 2026

Senior Software Engineer

operations (e.g., FlashAttention, GEMM, LayerNorm) to outperform standard libraries. Inference Engine Architecture: Contribute... to the development of our high-performance inference engine, focusing on graph optimizations, operator fusion, and dynamic...

Company: Microsoft
Location: Beijing
Posted Date: 14 Feb 2026

Algorithm Engineer

efficient inference on both server-side and embedded targets. Experimentation, Evaluation, and Knowledge Transfer to other team...

Company: Mercedes-Benz
Location: Beijing
Posted Date: 05 Feb 2026

Senior Machine Learning Engineer

for optimizing inference latency and cost Familiarity with GPU/TPU acceleration and distributed inference architectures Experience... Proficiency in deep learning frameworks (TensorFlow, PyTorch) and deployment tools (ONNX, tf-serving, TorchServe, Triton Inference...

Company: Grab
Location: Beijing
Posted Date: 23 Jan 2026

Software Engineer II

, transformer networks, reinforcement and transfer learning, etc.) Facility with classical methods of statistical inference...

Location: Beijing
Posted Date: 25 Dec 2025