NVIDIA Jetson/Orin + FPGA High-Compute Edge AI Box: Meituan Xiaodai Autonomous Delivery Vehicle

#AI #EdgeComputing

Meituan's Xiaodai autonomous delivery robot represents one of the more ambitious real-world deployments of edge AI in last-mile logistics — a compact, battery-powered vehicle navigating busy urban environments to carry food orders from restaurant to doorstep, powered at its core by the NVIDIA Jetson AGX Xavier. This post looks at the hardware choices behind Xiaodai and why a robot of its size still demands the same computational firepower as a full-scale self-driving car.

Meituan and the Last-Mile Problem

Meituan-Dianping occupies a unique position in Chinese tech: it blends food delivery, local business discovery, and group commerce into a single platform, partnering with more than 400,000 local businesses. In scale and business model it sits somewhere between Uber Eats, Yelp, and Groupon rolled into one. With that kind of delivery volume, even incremental automation of the last-mile segment — the notoriously expensive and labor-intensive stretch between a restaurant and a customer's door — has enormous economic leverage.

Xiaodai (小袋, meaning "small bag") is Meituan's answer to that problem. The vehicle is compact enough to operate on pedestrian paths and building lobbies, carries its own battery, and is designed to handle the full end-to-end delivery loop autonomously: navigating from a restaurant pick-up point, routing through mixed pedestrian and vehicle traffic, and completing delivery at the destination.

Why a Small Robot Needs Big Compute

The intuitive assumption is that a vehicle small enough to fit in a lobby requires proportionally less processing power than, say, a full-size autonomous car. That assumption is wrong, and it is the central engineering insight behind the Xiaodai hardware architecture.

Sensing, localization, and path planning are compute-bound problems that scale with environmental complexity, not vehicle size. Xiaodai must:

  • Fuse multi-modal sensor streams — typically a combination of cameras, LiDAR, ultrasonic sensors, and IMU — into a coherent, real-time model of its surroundings.
  • Localize precisely in GPS-denied environments such as underground parking garages, building lobbies, and narrow alleyways, where satellite positioning is unavailable or unreliable.
  • Plan and replan paths dynamically around pedestrians, cyclists, other delivery vehicles, and unpredictable obstacles — all at the low latency needed to avoid collisions at walking speed.

Each of these tasks involves deep neural network inference running continuously, often at 30 fps or higher. The aggregate compute demand is functionally equivalent to what larger autonomous vehicles require, compressed into a platform that must also stay within strict power and thermal budgets dictated by the onboard battery.
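A back-of-the-envelope calculation makes the point concrete. The sketch below totals the sustained compute needed to run several networks concurrently at 30 fps and compares it against the Xavier's 32 TOPS peak; the model names, per-inference costs, stream counts, and utilization factor are illustrative assumptions, not Meituan's actual perception stack:

```python
# Back-of-the-envelope compute budget for concurrent neural networks.
# All workload numbers below are hypothetical, for illustration only.

FPS = 30                 # target frame rate per sensor stream
PEAK_TOPS = 32           # Jetson AGX Xavier peak INT8 throughput
UTILIZATION = 0.4        # assumed fraction of peak sustained in practice

# (name, giga-operations per inference, number of concurrent streams)
WORKLOADS = [
    ("object_detection",      40, 4),   # e.g. several camera feeds
    ("semantic_segmentation", 80, 1),
    ("depth_estimation",      30, 2),
]

def required_tops(workloads, fps):
    """Total sustained TOPS needed to run every stream at the target rate."""
    giga_ops_per_sec = sum(gops * streams * fps
                           for _, gops, streams in workloads)
    return giga_ops_per_sec / 1000.0    # GOP/s -> TOPS

demand = required_tops(WORKLOADS, FPS)
budget = PEAK_TOPS * UTILIZATION
print(f"demand: {demand:.1f} TOPS, usable budget: {budget:.1f} TOPS")
# With these assumed numbers: demand 9.0 TOPS vs. 12.8 TOPS usable.
```

Even these modest hypothetical workloads consume a large fraction of the usable budget, which is why a walking-speed robot can still saturate car-class silicon.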

The NVIDIA Jetson AGX Xavier as Core Compute

The Jetson AGX Xavier addresses this constraint directly. It delivers up to 32 TOPS (tera-operations per second) of AI performance within a 10–30 W power envelope, integrating a Volta GPU, an eight-core ARM v8.2 CPU, dual NVDLA (NVIDIA Deep Learning Accelerator) engines, and a dedicated vision accelerator on a single module. For a battery-powered edge robot, the ability to run multiple simultaneous neural networks — perception, depth estimation, object detection, semantic segmentation — without a separate discrete GPU or wall power is decisive.

The platform also runs NVIDIA's full software stack, including TensorRT for optimized inference, CUDA, and the robotics-oriented Isaac SDK, which provides pre-built building blocks for sensor fusion, localization (including visual odometry and LiDAR-based SLAM), and motion planning. This substantially reduces the engineering effort needed to integrate the hardware with the rest of the autonomy stack.
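To give a flavor of that stack in practice, TensorRT ships a `trtexec` command-line tool that can compile a model for either the Volta GPU or one of the NVDLA engines. A minimal sketch follows; the model filename is a placeholder, and mode IDs and flag support should be verified against the JetPack and TensorRT versions actually installed:

```shell
# Select a power mode on the Jetson (mode IDs vary by board/JetPack
# version; `nvpmodel -q` shows the current one).
sudo nvpmodel -m 0

# Compile a placeholder ONNX model into a TensorRT engine targeting
# DLA core 0, falling back to the GPU for layers the DLA can't run.
trtexec --onnx=detector.onnx \
        --saveEngine=detector.plan \
        --int8 \
        --useDLACore=0 \
        --allowGPUFallback

# Benchmark the resulting engine's latency and throughput on-device.
trtexec --loadEngine=detector.plan --iterations=200
```

Offloading steady-state networks to the DLA cores in this way frees the GPU for the more dynamic parts of the perception workload, which is exactly the multi-engine usage pattern the AGX Xavier was designed around.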

Xia Huaxia, Chief Scientist of Meituan, summarized the strategic rationale: "Autonomous delivery vehicles are crucial for the development of the logistics industry and can significantly improve distribution and delivery. We look forward to leveraging the powerful AI capabilities of the Jetson AGX Xavier to enhance the functionality of Xiaodai autonomous delivery robots."

Real-World Test Sites

Meituan is validating Xiaodai across three geographically and operationally distinct environments:

  • Beijing Chaoyang Joy City — a large urban shopping and dining complex, presenting dense pedestrian traffic, narrow corridors, and multiple elevator interactions.
  • Lenovo's Shenzhen office campus — a controlled corporate environment with predictable traffic patterns, useful for refining baseline autonomy before scaling to public spaces.
  • Xiong'an New Area — a planned smart-city development southwest of Beijing, where infrastructure is being built with autonomous vehicles in mind and regulatory pathways are more accommodating.

The diversity of these test environments is deliberate. Each presents a different mix of sensor challenges, crowd density, and infrastructure quality, and validating across all three gives Meituan a more robust picture of where the system's edge cases lie before broader commercial deployment.

Broader Significance for Edge AI in Robotics

Xiaodai is part of a broader shift in robotics deployments toward high-density, purpose-built AI compute at the edge. The article's title references an FPGA alongside the Jetson platform — a common pairing in industrial and robotics applications where FPGAs handle time-critical, deterministic tasks such as sensor pre-processing, hardware interfacing, and hard real-time control loops, while the Jetson handles the neural-network-heavy perception and planning workloads. This division of labor lets each compute element operate where it is most efficient.
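A minimal sketch of that division of labor, with the FPGA modeled as a deterministic pre-processing stage and the Jetson as the inference consumer. Everything here is a hypothetical simplification in plain Python: a real system would move data over DMA or a high-speed bus, and the "inference" stand-in below is a toy threshold, not an actual network:

```python
# Hypothetical FPGA/Jetson split: the "FPGA" stage does fixed-latency,
# deterministic per-sample work; the "Jetson" stage does the heavy,
# variable-latency neural-network inference.

def fpga_preprocess(raw_samples, threshold=0.1):
    """Deterministic work suited to FPGA fabric: discard weak returns
    and attach a sample index (standing in for a hardware timestamp)."""
    return [(i, s) for i, s in enumerate(raw_samples) if s >= threshold]

def jetson_infer(batch):
    """Stand-in for neural-network inference on the Jetson: here just a
    toy 'detector' that flags strong returns."""
    return [idx for idx, value in batch if value > 0.5]

# Simulated LiDAR intensity returns flowing through the two stages.
raw = [0.02, 0.7, 0.4, 0.05, 0.9]
pre = fpga_preprocess(raw)        # FPGA: cheap, deterministic filtering
detections = jetson_infer(pre)    # Jetson: expensive model inference
print(detections)                 # indices of strong returns -> [1, 4]
```

The point of the split is that the FPGA stage has a fixed, verifiable worst-case latency, which keeps the hard real-time guarantees out of the GPU's variable-latency scheduling domain.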

For edge AI practitioners, the Xiaodai deployment is a useful data point: it demonstrates that the compute requirements for real-world autonomous navigation do not decrease just because the platform is small, and that the Jetson AGX Xavier class of hardware has reached the point where it can serve as the sole AI compute backbone for a commercially deployed, urban-environment robot.