Domestic RK3568+FPGA Solution Centered on "Real-time Control + High-Precision Acquisition + Flexible Expansion"

As industrial automation moves deeper into edge-AI territory, system designers face a familiar tension: general-purpose application processors offer rich software ecosystems and AI inference capabilities, but they cannot match the deterministic timing of dedicated hardware. Meanwhile, standalone FPGAs deliver the real-time control and high-speed I/O that factories demand, but they lack the compute horsepower to run modern perception models. The RK3568+FPGA pairing resolves this tension by treating the two devices as complementary halves of a single control plane—and doing so entirely on domestically sourced silicon.

This post walks through the core technical architecture of this combined solution, its key performance characteristics, and how it has been deployed across sectors ranging from precision manufacturing to energy-grid inspection.


Architecture Overview

The RK3568 is a quad-core ARM Cortex-A55 SoC from Rockchip, clocked at up to 2.0 GHz, integrating a 0.8 TOPS NPU, a Mali-G52 GPU, and a full suite of industrial connectivity peripherals including dual Gigabit Ethernet, CAN bus, PCIe 3.0, and a Flexible Serial Peripheral Interface (FSPI). The companion FPGA handles all tasks that require cycle-accurate, deterministic timing or custom I/O logic. These are requirements that an application processor running Linux cannot reliably meet without specialized real-time extensions.

The two chips communicate over two complementary paths depending on payload size and latency budget: FSPI for small, latency-sensitive messages and PCIe 3.0 DMA for bulk data transfer.


Real-Time Control via AMP and GPIO Interrupts

One of the most demanding requirements in industrial control is emergency response latency. A chemical reactor that needs to halt within microseconds of a fault signal cannot afford OS scheduling jitter.

The solution uses an Asymmetric Multi-Processing (AMP) architecture. The RK3568's master cores run Linux for HMI, networking, and AI inference workloads. A dedicated core (or the FPGA itself, acting as a real-time coprocessor) runs a bare-metal or RTOS firmware layer. When an emergency input arrives, the FPGA asserts a GPIO interrupt to the real-time core, bypassing the Linux scheduler entirely. The reported end-to-end latency from interrupt to action is 4 μs.

Practical deployments of this mechanism include:

  • Emergency stop (E-stop) circuits on robotic arms, where the interrupt path must be guaranteed regardless of Linux load
  • Chemical process safety interlocks, where a case study showed spurious-action rates dropping from 0.5% to 0.02% after migration to the AMP architecture
  • Multi-axis servo synchronization, where the FPGA processes 16 encoder channels and pushes position data into RK3568 RAM via PCIe DMA, keeping cycle jitter below 1 μs across six simultaneous servo axes
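
The servo-synchronization case can be illustrated from the Linux side. The sketch below assumes a hypothetical DMA frame layout (a 64-bit FPGA timestamp in nanoseconds followed by 16 signed 32-bit encoder counts, little-endian); the real layout is not given here, and the names `FRAME`, `parse_frames`, and `max_jitter_ns` are illustrative only. It unpacks frames from a buffer and reports cycle jitter as the worst deviation of the frame interval from the nominal period.

```python
import struct

# Hypothetical frame layout (an assumption, not from the article):
# u64 timestamp in ns, then 16 signed 32-bit encoder counts, little-endian.
FRAME = struct.Struct("<Q16i")

def parse_frames(buf: bytes):
    """Split a DMA buffer into (timestamp_ns, counts) tuples; trailing partial frames are ignored."""
    frames = []
    for off in range(0, len(buf) - FRAME.size + 1, FRAME.size):
        ts, *counts = FRAME.unpack_from(buf, off)
        frames.append((ts, counts))
    return frames

def max_jitter_ns(frames, nominal_period_ns: int) -> int:
    """Worst-case deviation of the inter-frame interval from the nominal period."""
    stamps = [ts for ts, _ in frames]
    deviations = (abs((b - a) - nominal_period_ns) for a, b in zip(stamps, stamps[1:]))
    return max(deviations, default=0)

# Three synthetic frames at a 125 us cycle, the last one arriving 400 ns late.
demo = b"".join(FRAME.pack(t, *range(16)) for t in (0, 125_000, 250_400))
print(max_jitter_ns(parse_frames(demo), 125_000), "ns worst-case jitter")
```

Measuring jitter from FPGA-side timestamps rather than from Linux arrival times keeps scheduler noise out of the measurement, which matters when the target is below 1 μs.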

High-Precision Multi-Channel Data Acquisition

Industrial condition monitoring—vibration analysis, thermal trending, pressure logging—requires simultaneous, phase-coherent sampling across many channels. The FPGA implements this front-end acquisition layer using chips such as the AD7616, a 16-channel, 16-bit SAR ADC capable of 1 MSPS aggregate throughput.

The FPGA timestamps and packetizes the raw samples, then routes them over FSPI or DMA into RK3568 memory. The NPU then runs lightweight inference models for real-time filtering and anomaly detection—tasks like identifying a bearing's fault signature in a vibration spectrum or flagging an out-of-range thermocouple.
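
To make the timestamp-and-packetize step concrete, here is a minimal host-side decoding sketch. The packet layout (sequence number, nanosecond timestamp, 16 samples) and the ±10 V input range are assumptions chosen for illustration; the actual packet format and the configured ADC range are not stated above.

```python
import struct

# Hypothetical packet layout (assumption): u32 sequence number,
# u64 timestamp in ns, then 16 signed 16-bit samples, little-endian.
PACKET = struct.Struct("<IQ16h")
FULL_SCALE_V = 10.0  # assumed bipolar +/-10 V input range

def code_to_volts(code: int, full_scale_v: float = FULL_SCALE_V) -> float:
    """Convert a 16-bit two's-complement ADC code to volts."""
    return code * full_scale_v / 32768.0

def decode_packet(raw: bytes) -> dict:
    """Unpack one acquisition packet into sequence, timestamp, and per-channel volts."""
    seq, ts_ns, *codes = PACKET.unpack(raw)
    return {"seq": seq, "timestamp_ns": ts_ns,
            "volts": [code_to_volts(c) for c in codes]}

def missing_packets(prev_seq: int, seq: int) -> int:
    """Number of packets lost between two consecutive sequence numbers (32-bit wraparound)."""
    return (seq - prev_seq - 1) % (1 << 32)
```

A sequence counter of this kind is a common way to let the receiver detect dropped packets before the samples reach a filter or model, since a silent gap would otherwise look like a phase jump in the signal.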

A precision manufacturing deployment reported that fault-prediction accuracy for motor and mechanical subsystems reached 92%, while unplanned downtime fell by 40% compared to the previous threshold-based alerting system.


Image Processing: Infrared Fusion and 4K Video

For applications that combine thermal and visible-light imagery—drone power-line inspection, perimeter security, quality vision systems—the FPGA handles pixel-level image fusion before the merged frame is passed to RK3568 for encode and display. The RK3568's hardware video engine handles 4K encode/decode, making it practical to record full-resolution footage locally while simultaneously streaming compressed previews over the network.
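
The fusion algorithm itself is not described here. As one simple possibility, pixel-level fusion can be modeled as a per-pixel weighted blend of registered visible and infrared frames. The sketch below is a software reference model under that assumption, with a hypothetical `alpha` weight; an FPGA implementation would typically use fixed-point arithmetic and a streaming pipeline instead.

```python
def fuse_pixels(visible, infrared, alpha=0.6):
    """Blend two equally sized 8-bit grayscale images (lists of rows).

    alpha is the weight of the visible image; (1 - alpha) goes to infrared.
    Assumes the two frames are already registered to the same pixel grid.
    """
    if len(visible) != len(infrared):
        raise ValueError("images must have the same number of rows")
    fused = []
    for v_row, ir_row in zip(visible, infrared):
        if len(v_row) != len(ir_row):
            raise ValueError("rows must have the same width")
        fused.append([
            min(255, max(0, round(alpha * v + (1 - alpha) * ir)))
            for v, ir in zip(v_row, ir_row)
        ])
    return fused

print(fuse_pixels([[100, 200]], [[200, 0]], alpha=0.5))
```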

In a provincial power-grid UAV inspection deployment, the dual-camera (infrared + visible) system achieved a defect-detection success rate exceeding 85% in adverse weather, and reduced manual inspection labor by 70%—translating to an annual operational saving of approximately ¥3 million.


Communication Architecture: FSPI vs. PCIe

The interface choice between FSPI and PCIe reflects a deliberate cost-versus-bandwidth trade-off:

| Interface | Latency | Bandwidth | Relative BOM Cost |
|---|---|---|---|
| FSPI (quad-wire SPI) | < 10 μs | ~25 MB/s (200 Mbps) | ~50% lower than PCIe |
| PCIe 3.0 (×1 or ×2) | ~1–5 μs (DMA) | Up to 6 Gbps | Higher |

FSPI is used for sensor networks, low-bandwidth control messages, and protocol-conversion bridges (e.g., FPGA translating Modbus RTU from legacy field devices into EtherCAT frames that RK3568 can route over its Gigabit Ethernet ports). The low component count and simplified PCB routing make FSPI attractive for cost-sensitive deployments where 200 Mbps is sufficient.
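
Any Modbus RTU bridge has to validate each incoming frame before translating it. The sketch below shows the standard Modbus CRC-16 (reflected polynomial 0xA001, initial value 0xFFFF, low byte transmitted first) in software form. The function names are illustrative; in the design described here this logic would live in FPGA fabric, and the EtherCAT side is not shown.

```python
def modbus_crc16(data: bytes) -> int:
    """Standard Modbus RTU CRC-16: init 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def split_rtu_frame(frame: bytes):
    """Validate an RTU frame and return (address, function_code, payload)."""
    if len(frame) < 4:
        raise ValueError("frame too short")
    body, crc_lo, crc_hi = frame[:-2], frame[-2], frame[-1]
    if modbus_crc16(body) != ((crc_hi << 8) | crc_lo):
        raise ValueError("CRC mismatch")
    return body[0], body[1], body[2:]

# Build a read-holding-registers request (slave 1, function 3) and parse it back.
body = bytes([0x01, 0x03, 0x00, 0x00, 0x00, 0x0A])
crc = modbus_crc16(body)
frame = body + bytes([crc & 0xFF, crc >> 8])  # CRC is sent low byte first
print(split_rtu_frame(frame))
```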

PCIe DMA is reserved for high-throughput pipelines: 3D point-cloud transfer from a vision coprocessor, multi-camera GMSL aggregation, or bulk waveform dumps from the ADC front-end. Machine-vision-guided robotic welding lines using this path achieved end-to-end latency below 5 ms and positioning accuracy of ±0.1 mm, with a reported yield improvement from 88% to 97% on an automotive body component line.
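
The trade-off between the two paths can be estimated with a simple first-order model: transfer time is roughly a fixed setup latency plus payload size divided by bandwidth. The sketch below plugs in the upper-bound latency and bandwidth figures quoted in the comparison above (10 μs and 200 Mbps for FSPI, 5 μs and 6 Gbps for PCIe). The model ignores protocol overhead, driver cost, and contention, so treat the output as a rough guide rather than a measurement.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    name: str
    setup_latency_us: float
    bandwidth_mbps: float  # megabits per second

    def transfer_time_us(self, payload_bytes: int) -> float:
        # bits divided by Mbit/s yields microseconds directly
        return self.setup_latency_us + payload_bytes * 8 / self.bandwidth_mbps

# Figures taken from the comparison above (upper bounds, illustrative).
FSPI = Link("FSPI", setup_latency_us=10.0, bandwidth_mbps=200.0)
PCIE = Link("PCIe DMA", setup_latency_us=5.0, bandwidth_mbps=6000.0)

for size in (64, 4096, 1 << 20):
    print(f"{size:>8} B  FSPI {FSPI.transfer_time_us(size):10.2f} us"
          f"  PCIe {PCIE.transfer_time_us(size):10.2f} us")
```

Under these figures PCIe is faster at every payload size, and the gap grows from a few microseconds for a 64-byte message to tens of milliseconds for a 1 MiB block. That is consistent with the framing above: FSPI earns its place through lower cost and simpler routing for small messages, not through speed.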


Industrial Applications in Practice

Smart Manufacturing

FPGA-driven 16-channel ADC acquisition feeds real-time vibration and temperature data into NPU-based anomaly models. Separately, 3D point-cloud data from a structured-light sensor is processed by the FPGA and forwarded over PCIe to guide robotic arm placement.

Energy and Power

Substation condition monitoring uses 16-channel ADCs to capture transformer vibration and acoustic signals. The FPGA performs FFT spectral analysis on-chip, then transfers the frequency-domain results to RK3568 via FSPI. Local AI model inference identifies fault signatures with a response time below 200 ms. For UAV-based line inspection, GMSL links aggregate multi-camera video at up to 6 Gbps and stream 4K footage to RK3568 for real-time defect classification.
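
The spectral-analysis step can be checked against a host-side reference model. The sketch below is not an FFT: it uses a naive O(n²) discrete Fourier transform from the standard library, which is enough to verify what an FPGA FFT core should report for a known test tone. The function names and the 1 kHz sample rate are assumptions for illustration.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Normalized single-sided magnitude spectrum via a naive DFT (reference model only)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]

def dominant_frequency_hz(samples, sample_rate_hz):
    """Frequency of the strongest non-DC bin."""
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=mags.__getitem__)  # skip the DC bin
    return k * sample_rate_hz / len(samples)

# Synthetic 50 Hz tone sampled at an assumed 1 kHz for 200 samples (5 Hz bin width).
tone = [math.sin(2 * math.pi * 50 * i / 1000) for i in range(200)]
print(dominant_frequency_hz(tone, 1000))
```

Sending only the frequency-domain result (or just the peak bins) instead of raw waveforms is what makes the lower-bandwidth FSPI path viable for this use case.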

Intelligent Transportation

Roadside units (RSUs) for vehicle-to-infrastructure coordination use the FPGA to aggregate eight GMSL camera feeds (1080p @ 60 fps each) and forward processed frames over PCIe 3.0 to RK3568 for trajectory tracking and traffic-flow prediction. One smart-highway deployment reported a 25% reduction in accident rate and an 18% improvement in throughput. Tunnel safety systems use FSPI to relay FPGA-preprocessed smoke and flame-detection events to RK3568, which in turn actuates ventilation and alarm systems within 500 ms.
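
A quick bandwidth budget shows why the FPGA forwards processed frames rather than raw video. The sketch below assumes 16 bits per pixel (for example YUV 4:2:2), which is not stated above, and compares the aggregate raw rate of eight 1080p60 feeds with the 6 Gbps PCIe figure quoted earlier.

```python
def stream_gbps(width, height, fps, bits_per_pixel, cameras=1):
    """Uncompressed video data rate in Gbps (10^9 bits per second)."""
    return width * height * fps * bits_per_pixel * cameras / 1e9

def required_reduction(raw_gbps, link_gbps):
    """Factor by which the stream must shrink to fit the link."""
    return raw_gbps / link_gbps

raw = stream_gbps(1920, 1080, 60, bits_per_pixel=16, cameras=8)  # 16 bpp is an assumption
print(f"raw aggregate: {raw:.2f} Gbps")
print(f"reduction needed for a 6 Gbps link: {required_reduction(raw, 6.0):.2f}x")
```

With these assumptions the raw aggregate is about 15.9 Gbps, so the stream has to shrink by a factor of roughly 2.7 before it fits. How the FPGA achieves that (downscaling, cropping to regions of interest, frame decimation, or compression) is not specified here.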

Warehousing and Logistics

Parcel-sorting lines deploy twelve GMSL cameras to capture six-face images of each package. The FPGA handles pixel preprocessing; RK3568 runs a quantized YOLOv5 model for real-time classification. Throughput reaches 3,000 parcels/hour with an error rate below 0.3%. For AGV fleet coordination, FSPI carries FPGA-computed path-planning updates to RK3568, enabling coordinated scheduling of up to 50 vehicles with a reported 35% improvement in warehouse throughput.
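
The sorting throughput implies a per-image inference budget, which is a useful sanity check when sizing the model. The sketch below assumes one image per camera per parcel and strictly sequential inference on a single NPU; both are illustrative assumptions, since batching or selecting a subset of views would relax the budget.

```python
def per_image_budget_ms(parcels_per_hour, images_per_parcel):
    """Average time available per image if every image is processed sequentially."""
    return 3_600_000 / (parcels_per_hour * images_per_parcel)

# 3,000 parcels/hour with twelve cameras, assuming one frame per camera per parcel.
print(per_image_budget_ms(3000, 12), "ms per image")
```

Under those assumptions the pipeline has about 100 ms per image, including preprocessing and post-processing.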


Environmental Hardening and Domestic Software Stack

The combined hardware platform is rated for −20°C to +70°C operation and carries EMC certification for industrial electromagnetic environments. System availability is specified at 99.99%.
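
Availability and MTBF are related but distinct quantities: steady-state availability is commonly computed as MTBF / (MTBF + MTTR). The sketch below converts the 99.99% figure into expected downtime per year and shows what MTBF it would imply for a given repair time. The 1-hour MTTR is a hypothetical value used only for illustration.

```python
HOURS_PER_YEAR = 365 * 24

def availability_pct(mtbf_hours, mttr_hours):
    """Steady-state availability as a percentage."""
    return 100.0 * mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_minutes_per_year(avail_pct):
    """Expected downtime per year for a given availability percentage."""
    return (1.0 - avail_pct / 100.0) * HOURS_PER_YEAR * 60

print(f"{downtime_minutes_per_year(99.99):.2f} min/year of downtime at 99.99%")
# With an assumed 1 h mean time to repair, 99.99% corresponds to an MTBF of 9,999 h.
print(f"{availability_pct(9999, 1):.2f}% availability")
```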

On the software side, the solution supports UOS (UnionTech OS) and Kylin OS, two Linux distributions developed in China, which satisfies industrial sectors that require a fully domestic supply chain down to the chip level. The NPU runtime supports deployment of models trained in TensorFlow and PyTorch. A cloud-edge co-training pipeline, in which a central training cluster sends incremental model updates to the edge devices, reportedly raises fault-prediction accuracy to 95%.


Power and Cost Efficiency

The entire platform consumes less than 10 W under typical industrial loads. Dynamic frequency scaling on the RK3568 reduces idle power further. Compared to x86-based control computers performing equivalent workloads, the RK3568+FPGA architecture delivers approximately 40% lower energy consumption—a meaningful figure in always-on embedded deployments.

The FPGA's reconfigurability is a secondary cost lever: the same hardware can be re-flashed with a different logic image to serve a different application vertical (power inspection vs. production-line QA), reducing the recurring cost of hardware redesign across product variants.


Summary

The RK3568+FPGA platform succeeds because it maps each class of work to the right substrate: Linux and the NPU handle scheduling, networking, and AI inference; the FPGA handles deterministic timing, custom protocol bridging, and high-channel-count analog front-ends; PCIe DMA and FSPI knit the two together with latencies from microseconds to tens of microseconds depending on payload size. The result is an entirely domestic-sourced solution that meets the industrial trifecta of low latency, high reliability, and broad interfacing capability at a power budget and unit cost that x86 platforms cannot match.