Appendix
Concept Description
FAQ Dialogue Matches the user's question to the nearest knowledge point in the knowledge base and replies with that knowledge point's answer.
Knowledge Points A knowledge point consists of questions and answers. Creating one requires a standard question, several similar questions, and one or more answers.
For example, "How about Beijing South Railway Station?" is a standard question, "Introduction to Beijing South Railway Station" is a similar question, and "Beijing South Railway Station is the railway station in Beijing with the largest area and the largest number of arriving and departing trains" is the answer. Together these constitute a knowledge point.
Out Of Scope A collection of user questions unrelated to the business. Adding them as out-of-scope knowledge points improves the accuracy of the agent's responses. When a user's question matches an out-of-scope knowledge point, the agent replies with the fallback response.
For example, for a "Beijing South Railway Station ticket-selling agent", the question "Which ticket outlets are there in Hongqiao Railway Station?" is irrelevant to the scenario and is suitable to add to the out-of-scope knowledge points.
Fallback Response The default response when the agent cannot understand the user's question. The strategy and answer of the fallback response can be configured in the system.
FAQ Recall Rate The proportion of agent responses (excluding fallback responses) to the total number of user messages. A higher recall rate means that more user questions are answered by the agent.
For example, out of 100 user messages, if 10 receive the fallback response and 90 are answered from relevant knowledge points, the recall rate is 90/100 = 90%.
FAQ Accuracy The proportion of correct agent responses to all agent responses (excluding fallback responses). A higher accuracy means the agent is more likely to answer user questions correctly.
For example, out of 100 user messages, if 10 receive the fallback response and 90 are answered from relevant knowledge points, of which 72 are correct, the accuracy is 72/(100-10) = 80%.
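The two metrics can be computed directly from the counts. A minimal sketch in Python, using the figures from the examples (the function names are illustrative, not part of the product):

```python
def faq_recall_rate(total_messages: int, fallback_responses: int) -> float:
    """Proportion of messages answered by a knowledge point (not fallback)."""
    return (total_messages - fallback_responses) / total_messages

def faq_accuracy(total_messages: int, fallback_responses: int,
                 correct_responses: int) -> float:
    """Proportion of correct answers among non-fallback responses."""
    return correct_responses / (total_messages - fallback_responses)

# Figures from the examples: 100 messages, 10 fallbacks, 72 correct answers.
print(faq_recall_rate(100, 10))   # 0.9
print(faq_accuracy(100, 10, 72))  # 0.8
```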
Task Dialogue Business scenarios with a clear process can be implemented through single- or multi-round task dialogues. A task dialogue consists of one or more intents.
For example, in the ticket-booking business scenario, the processes of "ticket booking", "ticket query" and "ticket refund" are clear and standardized, and can be implemented through task dialogues.
Intent In a business scenario, something a user wants the agent to accomplish. In the system, an intent is composed of a trigger and several dialogue units connected by jump relations.
For example, "book a ticket", "check a ticket" and "refund a ticket" are three intents in the ticket-booking task dialogue.
Slot Used to store key information of the task dialogue. The flow of an intent is driven by slot filling and by judgments based on slot values.
For example, in the "book air tickets" intent, the "departure city" slot stores the departure city provided by the user.
Entity A specific information element. To identify relevant information in a user's sentence, a slot must reference an entity.
For example, if the "city" entity contains the values "Beijing", "Shanghai" and other city names, then in the "book tickets" intent the "departure city" slot references the "city" entity, and the agent can recognize city names in user messages.
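Conceptually, filling a slot through an enumerated entity reduces to looking for entity values inside the message. A minimal sketch (the entity values and function name here are assumptions for illustration, not the product's API):

```python
from typing import Optional

# Illustrative "city" entity with enumerable values.
CITY_ENTITY = {"Beijing", "Shanghai", "Guangzhou"}

def fill_departure_city(message: str) -> Optional[str]:
    """Fill the 'departure city' slot if a 'city' entity value appears."""
    for city in CITY_ENTITY:
        if city in message:
            return city
    return None  # slot stays empty; the agent may ask a follow-up question

print(fill_departure_city("Book a ticket from Beijing"))  # Beijing
```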
Trigger The trigger condition of an intent, which defines the scenarios in which the task starts. An intent can be triggered by key phrases or by similar questions.
For example, the trigger of the "activity enrollment" intent can be the keyword "enrollment", or similar questions such as "How to enroll for an activity?" and "How do I sign up for the event?".
Sentence Tagging When a sentence satisfies the trigger conditions of multiple intents, the triggering priority of a specific intent can be raised by creating a sentence tag.
For example, in the ticket-booking scenario, the question "How to return a ticket after booking a ticket?" matches both the "book a ticket" and "return a ticket" intents, which is hard for the agent to disambiguate. Tagging the sentence pattern "how to return a ticket" makes the "return a ticket" intent trigger first.
Domain Vocabulary Words that are unique to a business field and uncommon in everyday language. Add domain vocabulary and their synonyms, and the agent will recognize these words as a whole.
For example: "ZHUGE Liang", "Himalayas", "nine-year compulsory education", etc.
System Entity The system provides pretrained system entities that can be used directly.
For example, the system entity "mobile number" supports identifying and extracting domestic 11-digit mobile numbers.
Enumerate Entity An entity whose values can be enumerated. An enumerated entity can have multiple entity values.
For example, the entity value of the "fruit" entity can be set to "apple", "banana", "peach", etc.
Confidence An indicator of the relevance between a recalled knowledge point or intent and the user's question. The higher the confidence, the more relevant it is. Confidence is a decimal between 0 and 1.
For example, the user asks "How about Beijing South Railway Station?". The agent recalls knowledge point A, "How about Beijing South Railway Station?", with confidence 0.95, and knowledge point B, "How about Beijing West Railway Station?", with confidence 0.76; knowledge point A is more relevant to the user's question.
Threshold The boundary of the confidence interval. After a threshold is set, the agent replies with a knowledge point's answer only when the confidence of the recalled knowledge point is higher than the threshold.
For example, the user asks "How about Beijing South Railway Station?". The agent recalls knowledge point A, "How about Beijing South Railway Station?", with confidence 0.95; knowledge point B, "How about Beijing West Railway Station?", with confidence 0.76; and knowledge point C, "Introduce Beijing South Railway Station", with confidence 0.83. With a threshold of 0.8, knowledge points A and C are recalled.
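Threshold-based recall reduces to filtering candidates by confidence. A sketch using the scores from the example (the data layout and function name are illustrative):

```python
# Candidate knowledge points and their confidence scores from the example.
candidates = {
    "A: How about Beijing South Railway Station?": 0.95,
    "B: How about Beijing West Railway Station?": 0.76,
    "C: Introduce Beijing South Railway Station": 0.83,
}

def recall(candidates: dict, threshold: float) -> list:
    """Return knowledge points above the threshold, highest confidence first."""
    hits = [(kp, conf) for kp, conf in candidates.items() if conf > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

for kp, conf in recall(candidates, 0.8):
    print(kp, conf)  # A (0.95) and C (0.83) are recalled; B is filtered out
```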
User Events Actions triggered by the user. Through events, user actions can be associated with the agent.
For example, when a user opens a program, the agent responds with a welcome message.
Debug Agent Located in the lower right corner of the agent page; you can test the agent's behavior through the debug function.
Dialogue Unit The basic element of an intent. The system provides a variety of built-in dialogue units, which are like different types of "building blocks". Dialogue units are organized according to the business process to form an intent, which is like assembling the "building blocks" into a toy according to the drawings.
Request Block The agent asks the user a question and fills the slot according to the user's response (if the slot is already filled, no question is asked). The core of the unit is the query content (the agent's question) and the slot (the information to be obtained).
For example: asking for the user's name, mobile phone number, etc.
Inform Block Sends one or more messages to the user. It is mainly used at the beginning and end of an intent.
Expect Block Continuously listens for user messages within the current intent. If information that can fill an associated slot is obtained, it fills the slot and branches within the current unit according to the slot value; if not, it does not actively ask the user.
For example, if the user says "I want to recharge" and the "traffic" slot is not filled, the default branch is "recharge"; when the user says "recharge data traffic", the "traffic" slot is filled and the flow jumps to the "recharge data traffic" branch.
Compute Block Operates on slots. The operation modes include: resetting a slot to a specified value or clearing it (reset mode), assigning text or another slot's value to a slot (assignment mode), and adding, subtracting, multiplying or dividing slot values (calculation mode).
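The three modes can be illustrated with a plain dictionary standing in for the slot store (a sketch; the slot names are made up for this example):

```python
# Slot store as a dict; each Compute Block mode is one operation on it.
slots = {"price": 100, "count": 2}

# Reset mode: set a slot to a specified value, or clear it entirely.
slots["price"] = 120
# Assignment mode: copy text or another slot's value into a slot.
slots["unit_price"] = slots["price"]
# Calculation mode: add, subtract, multiply or divide slot values.
slots["total"] = slots["unit_price"] * slots["count"]

print(slots["total"])  # 240
```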
Collect Block Saves the slot data obtained in the dialogue as a record, which can be viewed on the "Data Analysis - User Feedback - Task Information Collection" page.
Read Table Block Finds the matching content in a table according to the specified conditions and stores it in the associated slot.
End Block Indicates that the current intent is complete. On completion, you can choose to end directly, or jump from the current intent to a specified intent or the previous intent.
Intent Clear Message A user message that is a sentence or phrase with complete semantics and a clear meaning.
Intent Unknown Message
- Definition 1: ambiguity; there are two or more possible interpretations, and the intent must be confirmed with the customer according to the business scenario.
- Definition 2: unknown reference; it is impossible to determine what the message refers to out of context.
Invalid Message
- User messages with no meaning, such as "ha" / "ah"
- Garbled text, other languages, etc.
- Messages that cannot be read due to speech-to-text errors, such as: "you", "uh huh", "how oh"
Note: Messages with unknown intent and invalid messages can both be classified under semantics-free knowledge points.
Learning Of Example (Learning Of New Knowledge Points) Online user messages or test messages whose confidence against existing knowledge-base messages is roughly 0.4-0.75 (the default range) enter the example pool. They are often used as the source of new similar questions under existing knowledge points, or of new knowledge points.
Excavate Before and after going online, the corpus is cleaned and imported into data mining. It is often used to mine knowledge points. [Applicable to projects with a corpus]
Fallback The unified answer used when the agent cannot accurately respond to the user.
Training Set For projects with a corpus: the set of valid messages used to train the model.
Test Set
- Definition 1 (projects with a corpus): a corpus set randomly selected from the historical corpus to verify the effect of the model. There is no overlap between the test set and the training set. [The test set should be at least 3-5 times the number of knowledge points, i.e. each knowledge point has on average 3-5 test messages]
- Definition 2 (projects without a corpus): a corpus set collected by crowdsourcing to verify the effect of the model. There is no overlap between the test set and the training set.
External Test Uses arbitrary messages within the coverage of the knowledge base to simulate user questions for batch testing, and calculates the top-N precision and recall under different thresholds (the test set should be at least 3-5 times the number of knowledge points, i.e. on average 3-5 test messages per knowledge point).
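A batch external test can be sketched as scoring each test message and checking whether its expected knowledge point appears in the top-N candidates above the threshold. The scorer below is a stand-in for illustration, not the product's model:

```python
def top_n_accuracy(test_set, score_fn, n=1, threshold=0.8):
    """Share of test messages whose expected knowledge point appears in the
    top-n candidates whose confidence exceeds the threshold."""
    hits = 0
    for message, expected in test_set:
        ranked = sorted(score_fn(message), key=lambda p: p[1], reverse=True)
        above = [kp for kp, conf in ranked if conf > threshold]
        if expected in above[:n]:
            hits += 1
    return hits / len(test_set)

def score_fn(message):
    # Stand-in scorer: returns (knowledge point, confidence) pairs.
    return [("kp_greeting", 0.9)] if "hello" in message else [("kp_other", 0.5)]

tests = [("hello there", "kp_greeting"), ("bye now", "kp_greeting")]
print(top_n_accuracy(tests, score_fn))  # 0.5
```

Running the same test set at several thresholds shows the trade-off described above: a higher threshold filters out more wrong answers but pushes more messages to the fallback response.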