The sixth generation of Xiao Bing is online. Why did Microsoft spend four years exploring emotional AI?

At 4:23 pm on July 26, Microsoft's artificial intelligence robot "Xiao Bing" sent a message through its WeChat public account: "I have upgraded to the sixth generation of Xiao Bing."

At the same time, Microsoft's global executive vice president Shen Xiangyang officially announced in Beijing that the sixth generation of Xiao Bing is online. This is the largest upgrade since Xiao Bing was born in 2014. Whether in the underlying emotional computing framework or in the interactive 3D appearance it presents, a new Xiao Bing is here.

At the press conference, Microsoft revealed for the first time how many users Xiao Bing has worldwide: 660 million.

Beyond WeChat, Xiao Bing now runs in the back end of Line, QQ, the Xiaomi ecosystem, NetEase Cloud Music, Huawei phones and other products. Its product forms include conversational AI robots, intelligent voice assistants, AI content creation and vertical-domain solutions, covering more than 40 platforms in five countries.

Looking back on Xiao Bing's four years of development: at its birth in 2014 it was a WeChat group chat assistant, providing information lookups such as weather, traffic and horoscopes. By 2015, the third-generation Xiao Bing had added the ability to "speak" and "hear", along with a voice with its own personality. And in July, Microsoft released the "Shalu Xiao Bing model" program, giving Xiao Bing the ability to teach itself singing.



The sixth generation of Xiao Bing (pictured right) has a new, interactive 3D appearance.

It can be said that, in going from a mechanical personal assistant to an emotional AI robot, Xiao Bing has increasingly blurred the boundary between itself and people.

In the field of artificial intelligence, the popular interpretation of NLP (Natural Language Processing) is "helping the machine understand human language and give people a response." As the showcase for Microsoft's work in speech recognition and semantic interaction, Xiao Bing has always been designed to analyze emotions (EQ) in addition to learning to communicate (IQ).

Today, "emotional AI" is becoming the feature that distinguishes Microsoft's Xiao Bing from most intelligent robots on the market. In general, it is not difficult for AI companies to develop a robot that replies automatically by accumulating a corpus, annotating data, and building a knowledge graph. Giving the robot emotion, however, and having it respond to users accurately in a human-like tone, remains a challenge in the field of artificial intelligence.

Microsoft Dual AI does not regard "openness" as the main theme
To let the AI robot respond the way a human would, Microsoft has built up technical barriers to entry in Xiao Bing's branch functions, such as chatting and singing.

In voice interaction, Xiao Bing introduced a new generation of technology with its fifth generation, released last year: Full-duplex Voice Sense.

Take most smart speakers on the market as an example. Every time you issue a command, you need to say a wake-up word. When using a Baidu smart speaker, for instance, users have to say: "Xiaodu, play a song for me. Xiaodu, Xiaodu, turn up the volume." With full-duplex technology, the user only needs to wake the device once (that is, say "Xiaodu" a single time) and can then carry on a continuous dialogue.
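The difference in wake-word handling can be sketched in a few lines of Python. This is only a toy model of the behavior described above, not Microsoft's or Baidu's implementation, and it does not model simultaneous listening and speaking. The `Speaker` class, the `run` helper and the wake word `xiaodu` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

WAKE_WORD = "xiaodu"  # assumed wake word, for illustration only


@dataclass
class Speaker:
    """Toy smart speaker. `continuous=True` mimics the wake-once behavior."""
    continuous: bool
    awake: bool = False
    executed: list = field(default_factory=list)

    def hear(self, utterance: str) -> None:
        text = utterance.strip().lower()
        if text.startswith(WAKE_WORD):
            self.awake = True
            # Drop the wake word and any separator that follows it.
            text = text[len(WAKE_WORD):].lstrip(" ,")
        if not self.awake:
            return  # device is not listening, so the command is ignored
        if text:
            self.executed.append(text)
            if not self.continuous:
                self.awake = False  # must be woken again before the next command


def run(continuous: bool, utterances: list) -> list:
    """Feed utterances to a fresh speaker and return the commands it executed."""
    speaker = Speaker(continuous=continuous)
    for utterance in utterances:
        speaker.hear(utterance)
    return speaker.executed


commands = ["Xiaodu, play a song", "turn up the volume", "next track"]
print(run(False, commands))  # wake word required before every command
print(run(True, commands))   # wake once, then continuous dialogue
```

In the first mode only the command that follows the wake word is executed; in the second, all three are. A real system would also need to decide when the conversation has ended, which this sketch leaves out.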

According to Microsoft, the longest recorded chat between a user and Xiao Bing lasted more than 4 hours and ran to more than 1,600 sentences.

On the day of the press conference, Xiao Bing also performed a Tengger-style version of "Invisible Wings" live, using the fourth and latest version of Microsoft's DNN model for AI singing.

Within Microsoft, this model is also known as the "Shalu Xiao Bing model". The name "Shalu" comes from the English word "Cell", meaning that it can copy and mutate like a cell. According to Microsoft, with the Shalu model Xiao Bing can blend techniques learned from humans into its own voice, such as imitating the characteristics of Tengger's singing.

Frequency-band diagram of a Xiao Bing song; the yellow circles mark Xiao Bing's "voice".

It is worth noting that this model is not limited to singing. According to Microsoft, a Xiao Bing studio has been set up to systematically model human creative abilities in poetry, lyrics, composition, painting and so on, using advanced learning techniques to imitate human creativity and ultimately give the machine the ability to create on its own.

As the boundary between Xiao Bing and people becomes more and more blurred, Microsoft is confronted with the question of user privacy. In addition, the possible abuse of functions such as voice imitation, for example in telecom fraud, requires Microsoft to remain vigilant in its decisions.

"We will isolate Xiao Bing's general framework model from the commercialization process, and will split off some vertical areas to create AI robots that meet industry needs in different fields and commercialize them separately," said Cao Wenzhao, general manager of Microsoft's artificial intelligence business department.

To better serve these partners, Microsoft also made another major announcement: the establishment of the Dual AI ecosystem platform.

In general, domestic AI companies take one of two approaches when building platforms. The first is fully open empowerment, in which the ecosystem is built by giving developers SDKs and APIs. For example, Baidu has opened up more than 100 AI capabilities from its underlying "Baidu Brain", allowing developers to build new applications on the DuerOS ecosystem.


In July of this year, Baidu released “Baidu Brain 3.0” and opened 110 AI capabilities.

The other is a closed, self-focused platform, which generally builds its ecosystem in the form of an "app store", such as the App Store. This centralized approach can easily lead to traffic imbalance, and it also limits how quickly AI applications themselves can iterate.

"Whether it is open or closed, the relationship between these two types of empowerment and developers is too loose, no one is responsible for the final product experience." Peng Shuang, head of Microsoft Xiaobing products, said. In addition, because the API and SDK emphasize versatility, it is impossible to apply the latest technical capabilities and the highest quality data to the API.

Therefore, unlike AI platforms such as Baidu DuerOS, Microsoft Dual AI does not regard "openness" as its main theme. Developers cannot call Xiao Bing's voice interaction, NLP and other capabilities through an API or SDK, as they can with Baidu's AI interfaces.

According to Microsoft's official disclosure, the Dual AI strategy is divided into three parts:

First, Microsoft provides Xiao Bing's overall framework capabilities to help partner platforms build their own AI.

Second, Microsoft Xiao Bing serves as an auxiliary AI on the partner platform and integrates into the platform's ecosystem.

Third, Microsoft launches joint applications and products built around each partner platform's distinguishing features, through technology, products and operations.

"The development of AI is inseparable from data, but we must emphasize the user experience and data security, and Dual AI forms an ecological environment of circular data, guiding us to cooperate with third-party partners." Shen Xiangyang said.

Higher-quality interaction paves the way for Xiao Bing's commercialization
When it comes to the EQ of AI robots, Li Di, the head of Microsoft Xiao Bing and known as the "Father of Xiao Bing", likes to give the media an example: a colleague sprained his foot and sent Xiao Bing a photo of it. Xiao Bing replied: "Are you seriously injured?"

This reply draws on two of Xiao Bing's abilities. One is image recognition: Xiao Bing needs to detect human body parts and recognize the "ankle" in the image. The other is that, once it has understood from the sender that someone is "injured", it expresses concern and comfort much as a human would.

This kind of deep emotional feedback is Li Di's ideal form of AI robot. In past interviews, Li Di has expressed dissatisfaction with the artificial intelligence products currently on the market.

"If an AI system is just answering questions and completing tasks, why don't users use search engines and mobile apps that they are used to?" Li Di said.

On the other hand, when Xiao Bing's answers are more human, the quality of its interaction with users also improves.

Li Di gave this example: when you say "help me order McDonald's" to Xiao Bing, an AI robot will usually just place the order. But there is another possible answer: when the AI finds that the user has had unhealthy habits for a long time, it will refuse the request for junk food.


Microsoft Xiaobing leader Li Di

Although being rejected does not feel good, this kind of humanized AI leaves a deeper impression on users. Some users will therefore treat Xiao Bing as a trustworthy partner and interact with it at a higher level of quality, which undoubtedly lays the foundation for Xiao Bing's commercialization.

At the end of August 2017, Rinna in Japan worked with the Lawson convenience store chain to send coupons to users. More than one million users received coupons in a single day. According to Lawson's statistics, each coupon brought the store nearly 20 yuan in profit on average.

"The reason why Xiao Bing can sell so many coupons is because she is more like a person when interacting with users," Li Di once told the media. Through simple chat, Xiao Bing can get users interested in the coupons, until in the end they take the initiative and ask Xiao Bing for them.

To make Xiao Bing's commercial scenarios more systematic, on July 26 Microsoft also announced Xiao Bing's first four commercial areas: finance, mass culture, media and publishing.

In finance, Xiao Bing's financial text generation technology, offered in cooperation with Wind Info and Wall Street, has covered about 90% of domestic financial institutions, 75% of approved qualified overseas investment institutions and about 40% of domestic individual investors.

In popular culture, the children's audiobooks that Xiao Bing produces automatically have been listened to for more than 4 million hours. The "Xiao Bing Sister Storytelling" audiobooks have covered more than 90% of children's early education robots and 80% of online listening platforms.

In television and radio broadcasting, Xiao Bing uses artificial intelligence technology to take part in producing and hosting programs, and has reached 21 TV programs and 28 radio programs.

In addition, Xiao Bing has been combined with Microsoft's Bing search engine technology to launch an auxiliary solution for two vertical industries, media and publishing, which has now reached more than 15 media platforms. The number of media and self-media public accounts that Xiao Bing supplies with artificial intelligence technology has exceeded 60,000.

