NLU Pipeline
This section describes how the NLU Pipeline processes input and how to customize its policies.
The NLU Pipeline depends on the agent language, so its stages vary with the language.
The configurations below show the pipeline for the English and Chinese language models, including the parameters that can be customized.
- English Model

language: en
pipeline:
  - name: wulai_nlu.preprocess.Preprocess
    case_conversion: ''    # case conversion ('lower'|'upper'|'')
    correction: false      # spelling error correction (true|false)
  - name: SpacyNLP         # lemmatization
    model: en_core_web_md
  - name: SpacyTokenizer
  - name: wulai_nlu.custom_entity_extractor.CustomEntityExtractor
    catelog:               # list of system entities
      - sys.date
      - sys.time
      - sys.city
    policy: 1              # policy for disambiguation
                           # 0: return all values
                           # 1: among values that belong to the same entity, return the longest one
                           # 2: return the longest one, whether or not the values belong to the same entity
    case_sensitive: false  # whether matching is case-sensitive (false|true)
    max_entities: 200      # upper limit on the number of entity values returned
  - name: wulai_nlu.keyword_classifier.KeywordClassifier
  - name: wulai_nlu.intent_searcher.IntentSearcher
  - name: wulai_nlu.example_classifier.ExampleClassifier
    threshold_entity: 0.6
    threshold_intent: 0.5
    threshold_ngram: 0.8
    template_only: true    # match patterns only, to improve efficiency (true|false)
  - name: wulai_nlu.rerank_classifier.RerankClassifier
  - name: FallbackClassifier
    ambiguity_threshold: -1  # minimum score gap between the top two results
    threshold: 0.7           # fallback score
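
The two FallbackClassifier parameters can be read as a pair of checks on the ranked intents. The sketch below is only an illustration of that reading, not the product's implementation: the function name `apply_fallback`, the `"fallback"` label, the strict `<` comparisons, and the assumption that scores lie between 0 and 1 are all hypothetical. Under these assumptions, `ambiguity_threshold: -1` never triggers, because a score gap cannot be negative.

```python
from typing import List, Tuple

FALLBACK_INTENT = "fallback"  # hypothetical label, not taken from the product


def apply_fallback(
    ranking: List[Tuple[str, float]],
    threshold: float = 0.7,
    ambiguity_threshold: float = -1.0,
) -> str:
    """Return the top intent name, or FALLBACK_INTENT if the ranking is not trusted."""
    if not ranking:
        return FALLBACK_INTENT

    # Highest score first.
    ordered = sorted(ranking, key=lambda item: item[1], reverse=True)
    top_name, top_score = ordered[0]

    # Check 1: the best score is below the fallback score.
    if top_score < threshold:
        return FALLBACK_INTENT

    # Check 2: the top two results are too close to tell apart.
    if len(ordered) > 1:
        gap = top_score - ordered[1][1]
        if gap < ambiguity_threshold:
            return FALLBACK_INTENT

    return top_name
```

With the defaults shown in the configuration, `apply_fallback([("greet", 0.9), ("bye", 0.85)])` keeps `greet`; raising `ambiguity_threshold` to `0.1` would send the same ranking to the fallback, since the gap between the two scores is smaller than 0.1.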
- Chinese Model

language: zh
pipeline:
  - name: wulai_nlu.preprocess.Preprocess
    case_conversion: ''    # case conversion ('lower'|'upper'|'')
    to_simplified: false   # convert Traditional Chinese to Simplified Chinese (true|false)
  - name: JiebaTokenizer
  - name: wulai_nlu.custom_entity_extractor.CustomEntityExtractor
    catelog:               # list of system entities
      - sys.date
      - sys.time
      - sys.province
      - sys.city
      - sys.any
    policy: 1              # policy for disambiguation
                           # 0: return all values
                           # 1: among values that belong to the same entity, return the longest one
                           # 2: return the longest one, whether or not the values belong to the same entity
    case_sensitive: false  # whether matching is case-sensitive (false|true)
    max_entities: 200      # upper limit on the number of entity values returned
  - name: wulai_nlu.keyword_classifier.KeywordClassifier
  - name: wulai_nlu.intent_searcher.IntentSearcher
  - name: wulai_nlu.example_classifier.ExampleClassifier
    threshold_entity: 0.6
    threshold_intent: 0.5
    threshold_ngram: 0.8
    template_only: true    # match patterns only, to improve efficiency (true|false)
  - name: wulai_nlu.rerank_classifier.RerankClassifier
  - name: FallbackClassifier
    ambiguity_threshold: -1  # minimum score gap between the top two results
    threshold: 0.7           # fallback score
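
The three `policy` values of the CustomEntityExtractor are easier to compare on a concrete input. The following is a minimal sketch under stated assumptions, not the extractor's actual code: it assumes disambiguation applies to matches whose character spans overlap, and the `Match` type, the `disambiguate` function, and the custom `venue` entity are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    entity: str  # entity name, e.g. "sys.city"
    start: int   # start offset in the text (inclusive)
    end: int     # end offset in the text (exclusive)
    value: str   # matched text


def overlaps(a: Match, b: Match) -> bool:
    """True if the two character spans share at least one position."""
    return a.start < b.end and b.start < a.end


def disambiguate(matches: List[Match], policy: int = 1, max_entities: int = 200) -> List[Match]:
    """Apply policy 0, 1 or 2 to the matches, then cap the result at max_entities."""
    if policy == 0:
        kept = list(matches)  # 0: return all values
    else:
        kept = []
        # Visit longest matches first so a longer match wins over shorter overlapping ones.
        for m in sorted(matches, key=lambda x: (-(x.end - x.start), x.start)):
            def competes(k: Match) -> bool:
                # Policy 1 compares only values of the same entity; policy 2 compares all.
                same_scope = policy == 2 or k.entity == m.entity
                return same_scope and overlaps(k, m)

            if not any(competes(k) for k in kept):
                kept.append(m)
        kept.sort(key=lambda x: x.start)  # restore text order
    return kept[:max_entities]


# Example: three overlapping matches in the text "New York City".
matches = [
    Match("sys.city", 0, 8, "New York"),
    Match("sys.city", 0, 13, "New York City"),
    Match("venue", 4, 8, "York"),  # hypothetical custom entity
]
for p in (0, 1, 2):
    print(p, [m.value for m in disambiguate(matches, policy=p)])
# 0 ['New York', 'New York City', 'York']
# 1 ['New York City', 'York']
# 2 ['New York City']
```

Policy 1 drops "New York" because a longer value of the same entity covers it, but keeps "York" because it belongs to a different entity; policy 2 keeps only the longest value overall.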
- To customize the NLU Pipeline, paste the configuration into the "Annotation" field under Agent Settings.