Version: 2.0.0

NLU Pipeline

This section introduces the NLU Pipeline process and how to customize its policies.

The NLU Pipeline depends closely on the agent language, so the NLU process varies from one agent language to another.
The pipelines for the different language models are listed below, together with the parameters that can be customized.

English Model
Chinese Model

language: en
pipeline:
- name: wulai_nlu.preprocess.Preprocess
  case_conversion: ''      # case conversion ('lower'|'upper'|'')
  correction: false        # spelling error correction (true|false)
- name: SpacyNLP           # lemmatization
  model: en_core_web_md
- name: SpacyTokenizer
- name: wulai_nlu.custom_entity_extractor.CustomEntityExtractor
  catelog:                 # list of system entities
  - sys.date
  - sys.time
  - sys.city
  policy: 1                # policy for disambiguation
                           # 0: return all values
                           # 1: among values that belong to the same entity, return the longest one
                           # 2: return the longest one, whether or not the values belong to the same entity
  case_sensitive: false    # case-sensitive (false|true)
  max_entities: 200        # An upper limit on the number of entity values returned
- name: wulai_nlu.keyword_classifier.KeywordClassifier
- name: wulai_nlu.intent_searcher.IntentSearcher
- name: wulai_nlu.example_classifier.ExampleClassifier
  threshold_entity: 0.6  
  threshold_intent: 0.5
  threshold_ngram: 0.8
  template_only: true      # match patterns only, to improve efficiency (true|false)
- name: wulai_nlu.rerank_classifier.RerankClassifier
- name: FallbackClassifier
  ambiguity_threshold: -1  # minimum score gap between the top two results
  threshold: 0.7           # fallback score
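
The three disambiguation policies of CustomEntityExtractor can be illustrated with a small sketch. This is not the wulai_nlu implementation: the Match type, the disambiguate function, and the entity name custom.team are hypothetical, and the sketch assumes that "return the longest one" applies to matches whose text spans overlap.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Match:
    """Hypothetical entity match: entity name plus character offsets."""
    entity: str  # e.g. "sys.city"
    start: int   # inclusive start offset in the utterance
    end: int     # exclusive end offset in the utterance

    @property
    def length(self) -> int:
        return self.end - self.start


def overlaps(a: Match, b: Match) -> bool:
    """True when the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


def disambiguate(matches: List[Match], policy: int) -> List[Match]:
    """Assumed reading of policy 0, 1, and 2 from the config comments."""
    if policy == 0:
        return list(matches)  # 0: return all values
    kept: List[Match] = []
    # Visit longest matches first so a longer match beats a shorter overlapping one.
    for m in sorted(matches, key=lambda x: (-x.length, x.start)):
        conflict = any(
            overlaps(m, k) and (policy == 2 or m.entity == k.entity)
            for k in kept
        )
        if not conflict:
            kept.append(m)
    return sorted(kept, key=lambda x: x.start)


# "New York City": two sys.city candidates and one hypothetical custom entity.
candidates = [
    Match("sys.city", 0, 8),      # "New York"
    Match("sys.city", 0, 13),     # "New York City"
    Match("custom.team", 0, 8),   # "New York" as a hypothetical custom entity
]
print(len(disambiguate(candidates, 0)))  # all candidates
print(len(disambiguate(candidates, 1)))  # longest per entity
print(len(disambiguate(candidates, 2)))  # longest overall
```

Under these assumptions, policy 1 keeps one value per entity where spans overlap, while policy 2 keeps a single value for the overlapping region regardless of entity.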

language: zh
pipeline:
- name: wulai_nlu.preprocess.Preprocess
  case_conversion: ''      # case conversion ('lower'|'upper'|'')
  to_simplified: false     # translate traditional Chinese to simplified Chinese (true|false)
- name: JiebaTokenizer
- name: wulai_nlu.custom_entity_extractor.CustomEntityExtractor
  catelog:                 # list of system entities
  - sys.date
  - sys.time
  - sys.province
  - sys.city
  - sys.any
  policy: 1                # policy for disambiguation
                           # 0: return all values
                           # 1: among values that belong to the same entity, return the longest one
                           # 2: return the longest one, whether or not the values belong to the same entity
  case_sensitive: false    # case-sensitive (false|true)
  max_entities: 200        # An upper limit on the number of entity values returned
- name: wulai_nlu.keyword_classifier.KeywordClassifier
- name: wulai_nlu.intent_searcher.IntentSearcher
- name: wulai_nlu.example_classifier.ExampleClassifier
  threshold_entity: 0.6  
  threshold_intent: 0.5
  threshold_ngram: 0.8
  template_only: true      # match patterns only, to improve efficiency (true|false)
- name: wulai_nlu.rerank_classifier.RerankClassifier
- name: FallbackClassifier
  ambiguity_threshold: -1  # minimum score gap between the top two results
  threshold: 0.7           # fallback score
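
The interaction of the two FallbackClassifier parameters can be sketched as follows. This is an illustration under assumptions, not the component's actual code: the function apply_fallback and the label FALLBACK_INTENT are hypothetical, and the sketch assumes that fallback is triggered when the top score is below threshold, or when the gap between the top two scores is below ambiguity_threshold.

```python
from typing import List, Tuple

FALLBACK_INTENT = "fallback"  # hypothetical label for the fallback result


def apply_fallback(
    ranked: List[Tuple[str, float]],
    threshold: float = 0.7,
    ambiguity_threshold: float = -1,
) -> str:
    """Pick an intent from (intent, score) pairs sorted by score, descending.

    Assumed semantics: fall back when the best score is too low, or when the
    two best scores are closer together than ambiguity_threshold.
    """
    if not ranked:
        return FALLBACK_INTENT
    top_intent, top_score = ranked[0]
    if top_score < threshold:
        return FALLBACK_INTENT
    if len(ranked) > 1:
        gap = top_score - ranked[1][1]
        # A gap is never negative for a sorted list, so a negative
        # ambiguity_threshold (such as -1) never triggers this branch.
        if gap < ambiguity_threshold:
            return FALLBACK_INTENT
    return top_intent


scores = [("greet", 0.9), ("goodbye", 0.85)]
print(apply_fallback(scores))                           # defaults from the config above
print(apply_fallback(scores, ambiguity_threshold=0.1))  # stricter ambiguity check
```

Under this reading, the value -1 shown in both configurations leaves the ambiguity check inactive, and only threshold decides whether a result is accepted.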

To customize the NLU Pipeline, paste the modified configuration into the "Annotation" field under Agent Settings.