Baidu recently announced another advance in speech recognition: it has carried image recognition techniques across into the speech domain by using a deep convolutional neural network (Deep CNN) for acoustic modeling. Combined with its end-to-end speech recognition technology, which is built on long short-term memory (LSTM) units and connectionist temporal classification (CTC), the approach reduces the error rate by 10% and substantially improves the performance of its speech recognition products. Baidu describes it as another major technical breakthrough following end-to-end speech recognition.
Deep CNN speech recognition modeling process
In recent years, CNNs have produced a steady stream of results in image recognition, with new architectures repeatedly setting accuracy records. In face recognition, for example, recognition accuracy has reached 99.7%. These advances, however, have not yet been fully applied to speech recognition. As an artificial intelligence company with a long record of research in speech technology, Baidu regards Deep CNN as the next breakthrough in speech recognition.
In the ImageNet competition, ever-deeper CNNs keep setting new performance records
Baidu was the first to introduce a deeper CNN into a commercial end-to-end speech recognition system, which reduced the error rate by 10%. End-to-end technology uses a single learning algorithm to handle everything from task input to output, cutting down on intermediate modules and human intervention, and its results improve markedly when it is backed by massive amounts of data. Baidu's end-to-end technology is currently at an industry-leading level. It is worth noting that speech recognition operates on the spectrum obtained from time-frequency analysis of the speech signal. If the time-frequency spectrum of an entire utterance is treated as an image, the CNNs widely used in image recognition can be applied to it, which helps cope with the diversity of speech signals. With the introduction of deeper CNNs, speech recognition performance has improved significantly. As Dr. Li Xiangang, head of recognition technology in Baidu's Speech Technology Department, put it: "The deeper, the better."
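The idea of treating a time-frequency spectrum as an image can be illustrated with a small sketch. The code below is not Baidu's model; it is a minimal NumPy example under assumed settings (16 kHz audio, 25 ms frames, 10 ms hop, a 512-point FFT, and the hypothetical helper names `log_spectrogram` and `conv2d_valid`). It turns a waveform into a 2-D log-magnitude spectrogram and slides one convolution kernel over it, which is the basic operation a CNN layer repeats with many learned kernels.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Return a (frames, n_fft // 2 + 1) log-magnitude spectrogram.

    The frame length, hop size, and FFT size are illustrative assumptions.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One FFT per frame: rows are time steps, columns are frequency bins.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.log(np.abs(spectrum) + 1e-8)


def conv2d_valid(image, kernel):
    """2-D cross-correlation with no padding, as used in CNN layers."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    # windows has shape (out_h, out_w, kh, kw); sum each patch times the kernel.
    return np.einsum("ijkl,kl->ij", windows, kernel)


# A one-second 440 Hz tone stands in for a speech recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(tone)  # the "image": time on one axis, frequency on the other
kernel = np.ones((3, 3)) / 9.0  # a fixed averaging kernel; a real CNN learns its kernels
feature_map = np.maximum(conv2d_valid(spec, kernel), 0.0)  # convolution followed by ReLU

print(spec.shape, feature_map.shape)
```

Because the kernel spans both axes, each output value summarizes a small patch of neighboring time steps and frequency bins, which is how a CNN can tolerate small shifts in time and pitch between speakers.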
Unlike academic research, Baidu's speech R&D focuses on the practical application of the technology, which raises both the technical difficulty and the bar for implementation. A speech recognition product needs a large-scale speech database on which performance gains can be demonstrated, as well as a model suited to running in an online speech recognition service. Baidu carried out its experimental research on thousands of hours of speech data and verified the results on a product speech database of nearly 100,000 hours. With speech data on this scale, the speech recognition system based on end-to-end technology performs significantly better than the previous framework.
Baidu iterates the algorithm models behind its speech recognition technology every year
Baidu's speech technology also has clear advantages in data, computing power, and algorithms. Baidu holds about 100,000 hours of accurately labeled speech data and runs a high-performance computing platform built on hundreds of GPUs. On the algorithm side, Baidu optimizes and iterates its models every year, and its speech recognition results have improved significantly, placing it among the industry leaders.
Previously, Baidu used end-to-end technology to develop the Deep Speech 2 speech recognition system, which improved recognition accuracy in noisy environments. In noisy conditions, its error rate is lower than that of the speech systems from Google, Microsoft, and Apple. Baidu's speech recognition accuracy currently stands at 97%, and the technology was named one of the ten breakthrough technologies of 2016 by the American science and technology magazine "MIT Technology Review". According to Dr. Li Xiangang, Baidu is stepping up research and development on Deep Speech 3, and he does not rule out that the Deep CNN announced this time will become a core component of Deep Speech 3.
Beyond these technical breakthroughs, Baidu has also been actively encouraging users to adopt voice interaction. The Baidu mobile app, Baidu Input Method, Baidu Maps, and Du Mi all support voice input. The "cross-border" Deep CNN is expected to be applied soon to Baidu products with very large user bases.