Keywords: AI Inference Engineer, Location: Beijing

Principal Software Engineer (GPU inference)

Design and build a unified GPU inference platform for Ads, ensuring scalability, reliability, efficiency. Optimize... model inference via batching, quantization, scheduling, memory management, runtime optimization, kernel-level improvements...

Company: Microsoft
Location: Beijing
Posted Date: 14 Dec 2025

AI Inference Engineer

Job Description: WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a cul...

Location: Beijing
Posted Date: 16 Dec 2025

Data Engineer, Amazon Global Selling - AIT

Engineer to collaborate with cross-functional teams to design and develop data infrastructure and analytics capabilities... structured data inputs for AI model training and inference (e.g., LLM applications, recommendation systems), optimizing feature...

Company: Amazon
Location: Beijing
Posted Date: 25 Jan 2026

AI Framework Development Engineer

frameworks for AMD GPUs. Your experience will be critical in enhancing GPU kernels, deep learning models, and training/inference... principles to drive continuous improvement. THE PERSON: Skilled engineer with strong technical and analytical expertise in C...

Location: Beijing
Posted Date: 23 Jan 2026

Software Development Engineer

frameworks for AMD GPUs. Your experience will be critical in enhancing GPU kernels, deep learning models, and training/inference... principles to drive continuous improvement. THE PERSON: Skilled engineer with strong technical and analytical expertise in C...

Location: Beijing
Posted Date: 21 Jan 2026

Tech Lead (Platform Architect + Applied AI Engineer) - Digital Twin & Clinical AI

+ Applied AI Engineer)- Digital Twin & Clinical Ai - Remote (Contractor) Location: Remote - Global - Philippines, Vietnam...%); a few real projects can be sufficient Bonus “brownie points”: Experience deploying AI models/LLMs on the edge (edge inference...

Company: Shae Group
Location: Beijing
Posted Date: 08 Jan 2026

Senior Machine Learning Systems Engineer (Training Optimization)

As a Senior Machine Learning Systems Engineer, you'll lead efforts to scale and optimize the training system for our large-scale... do (responsibilities) You'll design, implement, and optimize large-scale machine learning systems for training and inference. You'll...

Company: Canva
Location: Beijing
Posted Date: 15 Nov 2025

Developer Technology Engineer Intern

and inference on GPU. You’ll join a team of ML, HPC and Software Engineers and Applied Researcher developing a framework designed...: In your role as CUDA Engineer Intern you will be profiling and investigating the performance of optimized code together...

Company: Nvidia
Location: Beijing
Posted Date: 05 Nov 2025

Machine Learning Engineer

such as TensorFlow or PyTorch and deployment tools (ONNX, tf-serving, TorchServe, Triton Inference Server) Solid software engineering...

Company: Grab
Location: Beijing
Posted Date: 23 Jan 2026

Senior Machine Learning Engineer

for optimizing inference latency and cost Familiarity with GPU/TPU acceleration and distributed inference architectures Experience... Proficiency in deep learning frameworks (TensorFlow, PyTorch) and deployment tools (ONNX, tf-serving, TorchServe, Triton Inference...

Company: Grab
Location: Beijing
Posted Date: 23 Jan 2026

AI Product Performance Engineer

Integration: Collaborate with software stack teams to expose optimized kernels within high-level frameworks and inference engines... kernels using OpenAI Triton or other Python-based DSLs for agile kernel development and auto-tuning. Inference Engine...

Location: Beijing
Posted Date: 21 Jan 2026

AI/ML Algorithms Engineer

, as well as quantization-aware training built on top of advanced quantization methods. Efficient inference algorithms research...

Company: Qualcomm
Location: Beijing
Posted Date: 15 Jan 2026

Senior Machine Learning Engineer - AI Effects and Editing

robust pipelines for LoRA-based model training, post-training quantization, and inference optimisation. Develop... with LoRA training, model post-processing (quantization, pruning), and on-device inference optimisation. Familiarity with image...

Company: Canva
Location: Beijing
Posted Date: 11 Jan 2026

Senior Software Engineer

, and a deep understanding of prompt engineering techniques. Solid problem-solving skills in LLM inference optimization, token..., but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience. Experience in optimizing LLM inference...

Company: Microsoft
Location: Beijing
Posted Date: 31 Dec 2025

Software Engineer II

, transformer networks, reinforcement and transfer learning, etc.) Facility with classical methods of statistical inference...

Location: Beijing
Posted Date: 25 Dec 2025

Senior Software Engineer

#, and Python. You will design and implement the core inference for our exceptional OCR and document layout analysis engine.... Inference Optimization Strategy: Spearhead efforts to optimize deep learning model inference for maximum speed and throughput...

Company: Microsoft
Location: Beijing
Posted Date: 15 Dec 2025

Principal Software Engineer

innovative system optimization solutions for internal LLM workloads. - Optimize LLM inference workloads through innovative kernel..., algorithm, scheduling, and parallelization technologies. - Continuously develop and maintain internal LLM inference...

Company: Microsoft
Location: Beijing
Posted Date: 10 Dec 2025

Senior Software Engineer

technical problems, advance state-of-the-art LLM technologies, and translate ideas into production. - Optimize LLM inference... LLM inference infrastructure. - A bachelor's degree or higher in computer science, engineering, or a related field, PhD...

Company: Microsoft
Location: Beijing
Posted Date: 10 Dec 2025

Generative AI Software Engineer

SDK toolchain Implement and optimize inference drivers for large language models (LLM) and large multimodal models (LMM..., transformer and the Hugging Face ecosystem Knowledge of LLM/LMM inference engines, such as llama.cpp or ExecuTorch Experience...

Company: Qualcomm
Location: Beijing
Posted Date: 07 Dec 2025

Algorithm Engineer

efficient inference on both server-side and embedded targets. Experimentation, Evaluation, and Knowledge Transfer to other team...

Company: Mercedes-Benz
Location: Beijing
Posted Date: 11 Nov 2025