
Smart Urban Infrastructure Management: AI-Powered Road Defect and Traffic Sign Recognition

Published on March 13, 2025

Urban infrastructure maintenance is a critical challenge for cities worldwide. Streets deteriorate due to heavy traffic, weather conditions, and the natural wear and tear of time. Keeping an up-to-date record of road conditions, street signs, and road markings is essential for efficient traffic management and urban planning. Automating this detection process with AI-driven computer vision enhances efficiency and accuracy compared to traditional manual inspections.

This project aimed to develop a system capable of detecting and mapping street defects, road paint markings, and traffic signs using artificial intelligence. The implementation followed two approaches: an offline processing method, where data was captured and later processed on a server, and a real-time edge computing approach, where the entire processing pipeline ran live on a Qualcomm RB5 device. Both approaches relied on a multi-threaded architecture to efficiently handle the different stages of data collection, processing, and analysis.

Offline Processing: Batch-Based Street Analysis

The first phase of the project focused on capturing video and GPS data using an RGB camera and GPS module mounted on a vehicle. As the vehicle moved through urban areas, it continuously recorded high-resolution video footage while logging location data. Multi-threading was implemented to synchronize video frames with their respective GPS timestamps, ensuring that each detected object could be mapped accurately.

The diagram below illustrates the multi-threaded system and its outputs, which are later used as inputs in the offline processing pipeline to synchronize video frames with GPS data. 
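The synchronization step can be sketched in a few lines of Python. This is a minimal, illustrative example, not the project's actual code: the camera and GPS producers here are simulated stand-ins (the real system reads an RGB camera and a GPS module), and each frame is paired with the GPS fix whose timestamp is closest.

```python
import threading
import queue
import time

frame_q = queue.Queue()
gps_q = queue.Queue()

def camera_thread(n_frames):
    """Simulated capture loop: the real system reads the RGB camera here."""
    for i in range(n_frames):
        frame_q.put({"frame_id": i, "ts": time.time()})
        time.sleep(0.01)

def gps_thread(n_fixes):
    """Simulated GPS reader: the real system parses the GPS module here."""
    for i in range(n_fixes):
        gps_q.put({"lat": -34.90 + i * 1e-4, "lon": -56.16, "ts": time.time()})
        time.sleep(0.02)

def nearest_fix(ts, fixes):
    """Pair a frame with the GPS fix whose timestamp is closest."""
    return min(fixes, key=lambda f: abs(f["ts"] - ts))

# Run both producers concurrently, then synchronize the collected data.
t1 = threading.Thread(target=camera_thread, args=(10,))
t2 = threading.Thread(target=gps_thread, args=(5,))
t1.start(); t2.start(); t1.join(); t2.join()

frames = [frame_q.get() for _ in range(frame_q.qsize())]
fixes = [gps_q.get() for _ in range(gps_q.qsize())]
synced = [{**fr, **nearest_fix(fr["ts"], fixes)} for fr in frames]
```

The result is one GPS-tagged record per frame, which is the input the offline pipeline expects.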

Once the data collection was complete, it was transferred to a high-performance server for processing. The pipeline began with preprocessing, which involved extracting frames from the video and aligning them with the GPS logs. Next came object detection, where a detection model identified street defects, road markings, and traffic signs. Post-processing was then applied to track objects and eliminate duplicate detections, ensuring a clean and accurate dataset. Finally, the processed information was converted into GeoJSON format, making it compatible with GIS applications for visualization and analysis.

The following diagram provides an overview of the processing pipeline, from frame extraction to object detection and final geospatial data storage.
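The final conversion step is straightforward to illustrate. The sketch below shows how GPS-tagged detections can be serialized as a GeoJSON FeatureCollection (note that GeoJSON orders coordinates as longitude, latitude); the field names are illustrative, not the project's actual schema.

```python
import json

def detections_to_geojson(detections):
    """Convert GPS-tagged detections into a GeoJSON FeatureCollection.

    Each detection is assumed to carry lon/lat plus the class label and
    confidence score produced by the detector (field names are illustrative).
    """
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [d["lon"], d["lat"]]},
            "properties": {"class": d["cls"], "confidence": d["conf"]},
        }
        for d in detections
    ]
    return {"type": "FeatureCollection", "features": features}

sample = [
    {"lon": -56.1645, "lat": -34.9011, "cls": "stop_sign", "conf": 0.93},
    {"lon": -56.1648, "lat": -34.9013, "cls": "pothole", "conf": 0.87},
]
geo = detections_to_geojson(sample)
print(json.dumps(geo, indent=2))
```

A file in this format can be dragged directly into QGIS as a vector layer, which is how the results shown below were visualized.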

The image below shows a processed route displayed in QGIS, with red dots marking the GPS samples and all detected traffic signs plotted on a map layer. Each detection is represented by an icon corresponding to the identified sign. Hovering over a detection opens a pop-up containing the class label, the detection confidence score, and an image of the detected sign, providing an interactive, detailed view of the processed data.

For pavement defects, the image below shows the vehicle-mounted camera view. Each detected defect is highlighted with a bounding box labeled with the defect type and an ID used for tracking and reference. This visualization offers an on-the-ground perspective of the detection process, showing how the model identifies and classifies pavement issues directly from the camera feed.
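The tracking IDs also drive duplicate elimination: the same pothole appears in many consecutive frames, and detections that overlap a box from the previous frame should keep its ID rather than be counted again. The post does not name the tracker used, so the sketch below shows only the simplest possible version of the idea, greedy association by intersection-over-union (IoU).

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def assign_ids(frames, thresh=0.5):
    """Give each detection a stable ID: reuse the ID of a sufficiently
    overlapping box from the previous frame, otherwise mint a new one."""
    next_id, prev, out = 0, [], []
    for boxes in frames:
        cur = []
        for box in boxes:
            match = next((p for p in prev if iou(box, p[0]) >= thresh), None)
            if match:
                cur.append((box, match[1]))
            else:
                cur.append((box, next_id))
                next_id += 1
        prev = cur
        out.append(cur)
    return out

# The same pothole seen in two consecutive frames keeps its ID.
tracks = assign_ids([[(10, 10, 50, 50)], [(12, 11, 52, 51)]])
```

Production trackers add motion models and handle occlusion, but the core idea of associating detections across frames to suppress duplicates is the same.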

This approach offered several advantages, such as the ability to run complex models without real-time constraints and the flexibility to fine-tune and reprocess data as needed. However, it had its limitations, primarily the delay in obtaining insights, as all data had to be transferred and processed before being usable. Additionally, dependency on a high-performance server meant increased infrastructure costs and logistical challenges in scaling the solution.

Real-Time Edge Processing on Qualcomm RB5

In the second phase, the goal was to eliminate this delay by running the entire pipeline live on a Qualcomm RB5 edge device. The same RGB camera and GPS setup was used, but everything was processed in real time on the device instead of being sent to a server.

To achieve this, a multi-threaded architecture was designed to manage different stages of processing efficiently. One thread handled GPS data reading and queuing, another captured and synchronized video frames, while additional threads focused on preprocessing images, running the object detection inferences on the Adreno GPU, and implementing object tracking to filter out duplicate detections. The final step formatted the data into GeoJSON and transmitted it for real-time mapping and analysis.

The diagram below illustrates the multi-threaded architecture used to efficiently process video, GPS, and AI inference on the Qualcomm RB5 device in real time.
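The stage-per-thread structure can be sketched with Python's standard threading and queue modules. This is a simplified model, not the project's code: bounded queues connect the stages so that a slow stage (typically inference) applies backpressure instead of exhausting the device's memory, and a sentinel value signals shutdown. The preprocessing and inference bodies here are arithmetic stand-ins for the real resize and GPU-inference work.

```python
import threading
import queue

# Bounded queues give backpressure between stages so a slow stage
# (e.g. inference) cannot exhaust memory on the edge device.
raw_q = queue.Queue(maxsize=8)
det_q = queue.Queue(maxsize=8)
SENTINEL = None

def preprocess(src):
    """Stage 1: stand-in for frame resize/normalization."""
    for frame in src:
        raw_q.put(frame * 2)
    raw_q.put(SENTINEL)

def infer():
    """Stage 2: stand-in for GPU inference on preprocessed frames."""
    while (item := raw_q.get()) is not SENTINEL:
        det_q.put(item + 1)
    det_q.put(SENTINEL)

results = []
threads = [threading.Thread(target=preprocess, args=(range(5),)),
           threading.Thread(target=infer)]
for t in threads:
    t.start()
while (out := det_q.get()) is not SENTINEL:
    results.append(out)
for t in threads:
    t.join()
print(results)  # [1, 3, 5, 7, 9]
```

In the real system the consumer at the end of the chain is the tracking and GeoJSON-formatting stage; the queue-and-sentinel pattern stays the same.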

Deploying AI models on an edge device came with challenges, particularly in optimizing the system for low latency and efficient memory management. Running inference on the GPU instead of the CPU significantly reduced processing time, while careful memory management helped avoid bottlenecks. To achieve near real-time performance, it was necessary to retrain the model, opting for a lighter version of the object detection algorithm with a lower frame resolution. Additionally, the final model was further optimized to balance accuracy and speed, ensuring efficient processing within the edge device's constraints. 
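Lowering the input resolution is the simplest of these optimizations to illustrate. The sketch below downscales a full-HD frame with plain NumPy nearest-neighbour indexing; the 320x320 target size is an assumption for illustration, as the post does not state the resolution actually used on the RB5 (a real pipeline would typically use an optimized resize from OpenCV or the vendor SDK).

```python
import numpy as np

def downscale(frame, size=(320, 320)):
    """Nearest-neighbour downscale to the lighter model's input size.

    The 320x320 default is illustrative only; the actual resolution
    used on the device is not specified in the post.
    """
    h, w = frame.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return frame[rows][:, cols]

hd_frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
small = downscale(hd_frame)
print(small.shape)  # (320, 320, 3)
```

Running the detector on a quarter-resolution input cuts both inference time and memory traffic roughly with the pixel count, which is why resolution is usually the first knob turned when fitting a model to an edge device.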

By moving from batch processing to real-time execution, the system was able to provide instant insights, eliminating the need to wait for post-processing. This not only improved operational efficiency but also reduced infrastructure costs by removing dependency on cloud or on-premise servers. Moreover, real-time processing enabled better scalability, allowing multiple vehicles to operate independently without requiring high-bandwidth data transmission.

Key Takeaways and Future Improvements

This project highlighted the trade-offs between offline batch processing and real-time edge computing. While batch processing allows for more detailed analysis and fine-tuning, real-time processing provides immediate feedback, making it a more practical solution for large-scale deployments in smart cities.

Future improvements could focus on optimizing the AI model further for enhanced efficiency, integrating additional sensors such as LiDAR for more precise defect detection, and refining GPS accuracy. This technology has immense potential in automated infrastructure monitoring, paving the way for AI-driven urban planning and road maintenance strategies.

Another key takeaway is that while the system currently utilizes the CPU for coordination and queue management and the GPU for object detection inferences, other powerful hardware components remain underutilized. Specifically, the Hexagon DSP, HTA (Hexagon Tensor Accelerator), and HTP (Hexagon Tensor Processor/NPU) offer untapped computational power that could significantly enhance performance. The DSP is well-suited for real-time signal processing and image transformations, making it ideal for offloading preprocessing tasks such as resizing and format conversion. The HTA can accelerate tensor operations, while the HTP (NPU) is optimized for low-power deep learning inference, making it an excellent target for tracking models. By leveraging these specialized processing units, the system can achieve higher efficiency, lower latency, and reduced CPU/GPU load, enabling more advanced real-time AI applications in future iterations.

Conclusion

Through the implementation of AI-powered computer vision, this project successfully demonstrated how automated street defect detection and mapping can be achieved. The transition from an offline batch processing system to a real-time edge computing solution on the Qualcomm RB5 showcases the increasing feasibility of AI at the edge. If interested, we invite you to have a look at our press release, "Digital Sense Utilizes Qualcomm Dragonwing RB5 Platform to Enhance Urban Infrastructure Monitoring," published with Qualcomm about this project.

Moving forward, advancements in model optimization, hardware efficiency, and sensor integration will further enhance the scalability and accuracy of this technology, making it a powerful tool for the future of smart cities and automated infrastructure monitoring. Explore more about our computer vision services or contact us.