Intelligent Video Surveillance Design Scheme Based on FPGA for Pedestrian Detection

Project implementation background and feasibility analysis:

Intelligent video surveillance is an important application field of computer vision with broad prospects, especially in places with strict security requirements, such as airports, subway stations, banks, and supermarkets.

Because pedestrians are the main actors in the events that occur in a surveillance scene, they are the main object of study for intelligent video surveillance systems. Such a system detects, tracks, and recognizes pedestrian targets in real time and then analyzes their movements or behavior. Our research topic, pedestrian detection, belongs to the target classification part of the intelligent monitoring system. Its function is to distinguish pedestrians from other objects in the video and locate them accurately. Detection performance directly affects the subsequent processing and the performance of the entire intelligent monitoring system, so this project has practical application significance.

Pedestrian detection in video images can be treated as a target classification problem. Designing a pedestrian detection system therefore starts with training a classifier, and the main steps of that training are the selection of target features and the selection of a machine learning algorithm. Modeling the target with features is more convenient and faster than using the pixel values of the image directly. In addition, feature extraction helps reduce the intra-class distance between similar objects and increase the inter-class distance between different types of objects, which makes the final classification more accurate. At present, rectangular features are often used to express the edge information of pedestrian shapes. The size and position of such features within the sample image are variable, so when they are enumerated pixel by pixel the total number of features is very large, and a feature selection algorithm is needed to choose the most discriminative features for classification. The cascaded Adaboost algorithm provides this function; it is mature, fast, and practical. In summary, mature pedestrian detection algorithms lay a solid theoretical foundation for this research project.
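To make the idea of a rectangular feature concrete, the sketch below computes a simple two-rectangle feature (left half minus right half, which responds to vertical edges). It assumes a summed-area table (integral image) is used to evaluate rectangle sums, which is a common choice but is not stated in the text; the function names are hypothetical.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2-D list of gray values."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left-half sum minus right-half sum of a w x h rectangle (w is assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

With the table built once per image, each rectangle sum costs four lookups regardless of the rectangle's size, which is why a very large pool of such features remains tractable to evaluate.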

Project implementation plan:

1 Basic block diagram and description of the project:

From a practical point of view, the pedestrian detection system described here has three functional modules, as shown in the block diagram in Figure 1. The first module extracts foreground targets through background modeling. The second module performs multi-scale detection of pedestrian targets, using a pedestrian classifier trained with the cascaded Adaboost algorithm on rectangular features. The third module merges the multiple detection results generated by the same target to produce the final detection and localization.


Figure 1: Block diagram of the pedestrian detection system

2 Background modeling

In a real video surveillance scene, large parts of the image are background. The system introduces background modeling to extract foreground targets, which narrows the search range for targets and thereby speeds up detection. The system uses a single Gaussian background modeling method to obtain the foreground targets.

The single Gaussian background model represents the gray-value distribution of each image pixel with a single Gaussian distribution $N(\mu_t, \sigma_t^2)$, where the subscript $t$ denotes the frame number and $\mu_t$ and $\sigma_t^2$ are the mean and variance of the distribution. Let $x_t$ be the gray value of the current pixel. If $P(x_t) = \frac{1}{\sqrt{2\pi}\,\sigma_t}\exp\left(-\frac{(x_t-\mu_t)^2}{2\sigma_t^2}\right) \le T$, where $T$ is the probability threshold, the pixel is judged to be a foreground point; otherwise it is a background point. In practical applications, the test is usually carried out on the deviation $d_t = |x_t - \mu_t|$, with the probability threshold expressed as a multiple of $\sigma_t$.
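As a worked step, the probability test can be rewritten as a test on the distance from the mean. The notation is assumed here: $x_t$ is the pixel's gray value, $\mu_t$ and $\sigma_t^2$ are the Gaussian mean and variance at frame $t$, and $T$ is the probability threshold, with $\sqrt{2\pi}\,\sigma_t T < 1$ so that the logarithm is negative.

```latex
\begin{aligned}
\frac{1}{\sqrt{2\pi}\,\sigma_t}\exp\!\left(-\frac{(x_t-\mu_t)^2}{2\sigma_t^2}\right) \le T
&\iff (x_t-\mu_t)^2 \ge -2\sigma_t^2 \ln\!\left(\sqrt{2\pi}\,\sigma_t T\right) \\
&\iff |x_t-\mu_t| \ge \sigma_t \sqrt{-2\ln\!\left(\sqrt{2\pi}\,\sigma_t T\right)}
\end{aligned}
```

Thresholding the probability is therefore equivalent to comparing the deviation with a multiple of the standard deviation, which avoids evaluating the exponential for every pixel.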

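To update the single Gaussian model, that is, the Gaussian parameters of each image pixel, we introduce the following update formulas, written here in the usual running-average form:

$\mu_{t+1} = (1-\alpha)\,\mu_t + \alpha\,x_t$, $\quad \sigma_{t+1}^2 = (1-\alpha)\,\sigma_t^2 + \alpha\,(x_t-\mu_t)^2$

where $x_t$ is the current gray value of the pixel, $\mu_t$ and $\sigma_t^2$ are its mean and variance at frame $t$, and $\alpha$ is the update rate. The value of $\alpha$ plays a key role in the extraction of foreground targets. If $\alpha$ is too small, the background model cannot keep up with changes in the actual background; if $\alpha$ is too large, a slowly moving object may be absorbed into the background model. This system sets $\alpha$ to 0.005.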
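The per-pixel update can be sketched as follows. This is a minimal illustration that assumes the common running-average form of the single Gaussian update, and it uses an assumed foreground test of k standard deviations; the value of `k` and the function names are hypothetical and are not taken from the text.

```python
ALPHA = 0.005  # update rate given in the text


def update_pixel(mu, var, x, alpha=ALPHA):
    """Return the updated (mean, variance) of one pixel after observing gray value x."""
    new_mu = (1.0 - alpha) * mu + alpha * x
    new_var = (1.0 - alpha) * var + alpha * (x - mu) ** 2
    return new_mu, new_var


def is_foreground(mu, var, x, k=2.5):
    """Flag x as foreground when it lies more than k standard deviations from the mean.

    k = 2.5 is an assumed value used only for illustration.
    """
    return abs(x - mu) > k * var ** 0.5
```

With an update rate this small, a single outlying frame barely moves the model: an observation of 120 against a mean of 100 shifts the mean by only 0.1.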

Gaussian background modeling yields the foreground pixels of the current frame. Foreground pixels are marked as 1 and background pixels as 0, producing a foreground mark image. In the subsequent multi-scale detection, the classifier is run on a sub-window only if that sub-window contains foreground pixels. Traversing the sub-windows takes little time, whereas the feature calculation of the cascade classifier is very time-consuming, so this check greatly reduces the detection time.
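The foreground check can be sketched as a filter over sub-window positions. This is an illustrative outline only: the foreground mark image is assumed to be a 2-D list of 0/1 values, and the function names and the `step` parameter are hypothetical.

```python
def window_has_foreground(mask, x, y, w, h):
    """Return True if any pixel of the w x h sub-window at (x, y) is marked 1."""
    return any(mask[r][c] for r in range(y, y + h) for c in range(x, x + w))


def candidate_windows(mask, win_w, win_h, step=1):
    """Yield top-left corners of sub-windows containing at least one foreground pixel."""
    rows, cols = len(mask), len(mask[0])
    for y in range(0, rows - win_h + 1, step):
        for x in range(0, cols - win_w + 1, step):
            if window_has_foreground(mask, x, y, win_w, win_h):
                yield (x, y)
```

Only the positions yielded here would be passed on to the cascade classifier; all other sub-windows are skipped without computing any features.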

3 Multi-scale detection

The system detects pedestrians by sliding a detection window over the image pixel by pixel at multiple scales and using the trained cascaded Adaboost pedestrian classifier to decide whether the window contains a pedestrian. The size of the detection window equals the size of the training samples. In an actual video scene, the apparent size of a person changes with the distance from the camera, so the size of the detection target must be matched to the sample size. To address this, the system shrinks the original image layer by layer so that, at some layer, the target size is consistent with the size of the detection window.
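How a cascaded classifier decides on one detection window can be sketched as below. The data layout is an assumption made for illustration: each stage is a dict holding a list of weak classifiers, written as (feature index, threshold, polarity, weight) tuples, and a stage threshold. The feature values are assumed to be precomputed for the window.

```python
def stage_passes(features, stage):
    """Evaluate one boosted stage: a weighted vote of threshold tests on feature values."""
    score = 0.0
    for idx, thresh, polarity, weight in stage["weak"]:
        # A weak classifier votes "pedestrian" when the feature is on the
        # expected side of its threshold (the side is selected by polarity).
        if polarity * features[idx] < polarity * thresh:
            score += weight
    return score >= stage["threshold"]


def cascade_detect(features, stages):
    """Return True only if the window passes every stage; stop at the first failure."""
    return all(stage_passes(features, s) for s in stages)
```

Because `all` stops at the first failing stage, most non-pedestrian windows are rejected after only a few feature tests, which is what makes the cascade fast.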

The choice of scaling factor also affects detection. If the scaling factor is too low, the target shape may be distorted, which can affect the detection result; if it is too high, the number of scaling steps increases and detection efficiency drops. As a trade-off, we chose a scaling factor of 0.85 for shrinking the image layer by layer. The scaling must be applied to both the original image and the foreground mark image, and it continues until the image is smaller than the detection window.
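The layer-by-layer scaling can be illustrated by listing the image size at each layer. This is a sketch under assumptions: sizes are rounded to the nearest pixel, and the image and window dimensions in the usage note are hypothetical.

```python
def pyramid_sizes(img_w, img_h, win_w, win_h, factor=0.85):
    """List the (width, height) of each layer, starting from the original size.

    Scaling stops before either dimension becomes smaller than the detection window.
    """
    sizes = []
    level = 0
    while True:
        w = round(img_w * factor ** level)
        h = round(img_h * factor ** level)
        if w < win_w or h < win_h:
            break
        sizes.append((w, h))
        level += 1
    return sizes
```

For example, a 320 x 240 frame with a 64 x 128 window gives four layers before the height falls below the window height. A factor closer to 1 would produce more layers and therefore more windows to classify.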

4 Merging of multiple detection windows

Because the system uses pixel-by-pixel, multi-scale traversal, the same target may produce multiple detection results (as shown in Figure 2(a)), so these overlapping windows must be merged into a single detection result (as shown in Figure 2(b)). During merging, the system first checks whether the current window has enough adjacent windows. Two windows R1 and R2 are considered adjacent if the ratio of the area S of their intersection (the shaded portion in Figure 2(c)) to the area of each of the two windows is greater than 0.6. If there are enough adjacent windows, the window is kept and averaged with its adjacent windows to form a new window (the dashed box in Figure 2(c)); if there are not, the window is deleted as a false detection.
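The merging rule can be sketched as follows. Several details are assumptions made for illustration: windows are (x, y, width, height) tuples, the overlap ratio is checked against each of the two windows, "enough" is taken to mean at least `min_neighbors` adjacent windows, and windows already merged into a group are not reused.

```python
def intersection_area(a, b):
    """Area of the overlap between two (x, y, w, h) windows, or 0 if they do not overlap."""
    ow = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    oh = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return ow * oh if ow > 0 and oh > 0 else 0


def are_adjacent(a, b, ratio=0.6):
    """Two windows are adjacent if the overlap exceeds `ratio` of each window's area."""
    s = intersection_area(a, b)
    return s > ratio * a[2] * a[3] and s > ratio * b[2] * b[3]


def merge_windows(windows, min_neighbors=2, ratio=0.6):
    """Average each window with its adjacent windows; drop windows with too few."""
    used = set()
    merged = []
    for i, win in enumerate(windows):
        if i in used:
            continue
        neighbors = [j for j in range(len(windows))
                     if j != i and j not in used
                     and are_adjacent(win, windows[j], ratio)]
        if len(neighbors) < min_neighbors:
            continue  # too few adjacent windows: treated as a false detection
        group = [i] + neighbors
        used.update(group)
        merged.append(tuple(sum(windows[j][k] for j in group) / len(group)
                            for k in range(4)))
    return merged
```

Three tightly overlapping windows around one pedestrian collapse into a single averaged window, while an isolated window elsewhere in the frame is discarded.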


Figure 2: (a) pre-merger (b) post-merger (c) rectangular window merging method

Project design goals:

For video images, we use the following two performance indicators to measure the performance of the detection system:

1. False alarm rate: the total number of non-pedestrian windows falsely detected over all frames / the total number of detection windows over all frames;

2. Detection rate: the total number of correctly detected pedestrian windows over all frames / the total number of detection windows over all frames.
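The two indicators can be computed from per-frame counts as in the sketch below. The input layout, a list of (correct windows, false windows, total windows) tuples with one tuple per frame, is a hypothetical choice for illustration, and the counts in the usage note are made-up numbers.

```python
def detection_metrics(frames):
    """Return (detection_rate, false_alarm_rate) accumulated over all frames.

    frames is a list of (correct_windows, false_windows, total_windows) tuples.
    """
    correct = sum(f[0] for f in frames)
    false = sum(f[1] for f in frames)
    total = sum(f[2] for f in frames)
    if total == 0:
        return 0.0, 0.0  # no detection windows at all
    return correct / total, false / total
```

For three frames with counts (3, 1, 4), (2, 0, 2), and (1, 1, 4), the totals are 6 correct and 2 false out of 10 windows, giving a detection rate of 0.6 and a false alarm rate of 0.2.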

The design goal of the system is to achieve real-time detection with a high detection rate and a low false alarm rate under complex image background conditions.

Experimental resources:

(1) Experimental platform

FPGA development platform and corresponding JTAG debugging and development tools

The target platform is the Xilinx Virtex-5 development platform, chosen for the following reasons. This project belongs to the field of digital communication and high-speed circuits and can be used in the aviation and military fields, so it requires a hardware platform with high processing speed and rich logic resources. The Virtex-5 platform is the latest FPGA development platform launched by Xilinx. The data processing capability of its core FPGA chip can reach 3.1G, and it can implement logic functions such as serial-to-parallel conversion, clock processing, and delay processing of high-speed digital signals.

(2) Test equipment:

Including DC regulated power supply, multimeter

(3) Simulation and development tools:

Including OpenCV, ModelSim, Xilinx ISE, etc.
