NLU Pipeline
This section introduces the NLU Pipeline process and how to customize its policies.
- The NLU Pipeline is closely tied to the agent language, so the NLU process varies with the agent language.
- The pipeline for each language model is listed below, including the parameters that can be customized: first the English model (language: en), then the Chinese model (language: zh).
language: en
pipeline:
- name: wulai_nlu.preprocess.Preprocess
  case_conversion: ''      # case conversion ('lower'|'upper'|'')
  correction: false        # spelling error correction (true|false)
- name: SpacyNLP           # lemmatization
  model: en_core_web_md
- name: SpacyTokenizer
- name: wulai_nlu.custom_entity_extractor.CustomEntityExtractor
  catelog:                 # list of system entities
  - sys.date
  - sys.time
  - sys.city
  policy: 1                # policy for disambiguation
                           # 0: return all values
                           # 1: among values that belong to the same entity, return the longest one
                           # 2: return the longest value, whether or not the values belong to the same entity
  case_sensitive: false    # case-sensitive (false|true)
  max_entities: 200        # An upper limit on the number of entity values returned
- name: wulai_nlu.keyword_classifier.KeywordClassifier
- name: wulai_nlu.intent_searcher.IntentSearcher
- name: wulai_nlu.example_classifier.ExampleClassifier
  threshold_entity: 0.6  
  threshold_intent: 0.5
  threshold_ngram: 0.8
  template_only: true      # match patterns only, to improve efficiency (true|false)
- name: wulai_nlu.rerank_classifier.RerankClassifier
- name: FallbackClassifier
  ambiguity_threshold: -1  # minimum score gap between the top 1 and top 2 results
  threshold: 0.7           # fallback score
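The three disambiguation policies of CustomEntityExtractor can be illustrated with a minimal Python sketch. It is not the wulai_nlu implementation: the `Match` class, the `disambiguate` function, and the `custom.state` entity are hypothetical, and the sketch assumes that "return the longest one" applies to matches whose character spans overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    entity: str  # entity name, e.g. "sys.city"
    start: int   # start offset in the utterance (inclusive)
    end: int     # end offset in the utterance (exclusive)
    value: str   # matched text


def overlaps(a: Match, b: Match) -> bool:
    """True when the two character spans share at least one position."""
    return a.start < b.end and b.start < a.end


def disambiguate(matches, policy=1):
    """Hypothetical reading of the `policy` parameter (an assumption).

    0: return all matches
    1: among overlapping matches of the same entity, keep the longest
    2: among overlapping matches of any entity, keep the longest
    """
    if policy == 0:
        return list(matches)
    kept = []
    # Visit the longest matches first so that shorter overlapping ones are dropped.
    for m in sorted(matches, key=lambda m: (-(m.end - m.start), m.start)):
        clash = any(
            overlaps(m, k) and (policy == 2 or k.entity == m.entity)
            for k in kept
        )
        if not clash:
            kept.append(m)
    return sorted(kept, key=lambda m: (m.start, m.entity))


# Utterance: "New York City"
candidates = [
    Match("sys.city", 0, 8, "New York"),
    Match("sys.city", 0, 13, "New York City"),
    Match("custom.state", 0, 8, "New York"),  # hypothetical custom entity
]

for policy in (0, 1, 2):
    result = disambiguate(candidates, policy)
    print(policy, [(m.entity, m.value) for m in result])
```

Under these assumptions, policy 1 keeps the longer `sys.city` value together with the `custom.state` value, while policy 2 keeps only the single longest match.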
language: zh
pipeline:
- name: wulai_nlu.preprocess.Preprocess
  case_conversion: ''      # case conversion ('lower'|'upper'|'')
  to_simplified: false     # convert traditional Chinese to simplified Chinese (true|false)
- name: JiebaTokenizer
- name: wulai_nlu.custom_entity_extractor.CustomEntityExtractor
  catelog:                 # list of system entities
  - sys.date
  - sys.time
  - sys.province
  - sys.city
  - sys.any
  policy: 1                # policy for disambiguation
                           # 0: return all values
                           # 1: among values that belong to the same entity, return the longest one
                           # 2: return the longest value, whether or not the values belong to the same entity
  case_sensitive: false    # case-sensitive (false|true)
  max_entities: 200        # An upper limit on the number of entity values returned
- name: wulai_nlu.keyword_classifier.KeywordClassifier
- name: wulai_nlu.intent_searcher.IntentSearcher
- name: wulai_nlu.example_classifier.ExampleClassifier
  threshold_entity: 0.6  
  threshold_intent: 0.5
  threshold_ngram: 0.8
  template_only: true      # match patterns only, to improve efficiency (true|false)
- name: wulai_nlu.rerank_classifier.RerankClassifier
- name: FallbackClassifier
  ambiguity_threshold: -1  # minimum score gap between the top 1 and top 2 results
  threshold: 0.7           # fallback score
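How the two FallbackClassifier parameters interact can be sketched in Python. This is an illustration only, not the platform's code: the `apply_fallback` function and the intent names are hypothetical, and the sketch assumes that fallback is triggered when the top score is below `threshold`, or when the gap between the top two scores is below `ambiguity_threshold`. Under that assumption a negative value such as -1 means the ambiguity check never fires, because a score gap is never negative.

```python
def apply_fallback(scores, threshold=0.7, ambiguity_threshold=-1):
    """Return the winning intent name, or None when fallback is triggered.

    `scores` maps intent name -> confidence score. The semantics are an
    assumption made for illustration, not confirmed by the source.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    top_name, top_score = ranked[0]
    # Rule 1: the best candidate is not confident enough.
    if top_score < threshold:
        return None
    # Rule 2: the best two candidates are too close to tell apart.
    if len(ranked) > 1 and top_score - ranked[1][1] < ambiguity_threshold:
        return None
    return top_name


# Hypothetical intent scores for one utterance.
scores = {"check_weather": 0.90, "check_time": 0.85}

print(apply_fallback(scores))                           # default: ambiguity check inactive
print(apply_fallback(scores, ambiguity_threshold=0.1))  # gap of 0.05 is too small
print(apply_fallback({"check_weather": 0.6}))           # below the 0.7 threshold
```

With the default values shown in the configuration, only the 0.7 score threshold decides whether a message falls back; raising `ambiguity_threshold` above 0 would additionally reject close calls.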
- To customize the NLU Pipeline, paste the modified configuration into the "Annotation" field under Agent Settings.