The invention disclosed herein concerns a
system (100) and method (600) for building a
language model representation of an NLU application. The method (600) can include categorizing an NLU
application domain (602), classifying a corpus in view of the
categorization (604), and training at least one
language model in view of the classification (606). The
categorization produces a hierarchical tree of categories, sub-categories and end targets across one or more features for interpreting one or more
natural language input requests. During development of an NLU application, a developer assigns sentences of the NLU application to categories, sub-categories or end targets across one or more features for associating each
sentence with desired interpretations. A
language model builder (140) iteratively builds
multiple language models from this
sentence data, evaluates them against a test corpus, partitions the data based on the
categorization, and rebuilds the models, so as to produce an optimal configuration of language models to interpret and respond to language input requests for the NLU application.
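
By way of illustration only, the iterative build, evaluate, partition, and rebuild loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the add-one smoothed unigram model, the average per-word log-likelihood score, the representation of each categorization as a tuple path, and the names train_unigram, score_partition, and build_language_models are all hypothetical and do not appear in the disclosure.

```python
import math
from collections import Counter, defaultdict


def train_unigram(sentences):
    """Build an add-one smoothed unigram model; returns a log P(word) function."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words

    def logprob(word):
        return math.log((counts[word] + 1) / (total + vocab))

    return logprob


def score_partition(train, test, depth):
    """Partition training data by the first `depth` levels of the category
    path, build one model per partition, and score them on the test corpus."""
    groups = defaultdict(list)
    for sentence, path in train:
        groups[path[:depth]].append(sentence)
    models = {key: train_unigram(sents) for key, sents in groups.items()}
    fallback = train_unigram([s for s, _ in train])  # for unseen categories

    total, n_words = 0.0, 0
    for sentence, path in test:
        model = models.get(path[:depth], fallback)
        for word in sentence.split():
            total += model(word)
            n_words += 1
    return total / n_words, models  # average log-likelihood per word


def build_language_models(train, test, max_depth):
    """Try successively finer partitions; keep one only if it scores better."""
    best_depth = 0
    best_score, best_models = score_partition(train, test, 0)
    for depth in range(1, max_depth + 1):
        score, models = score_partition(train, test, depth)
        if score > best_score:
            best_depth, best_score, best_models = depth, score, models
    return best_depth, best_score, best_models


# Hypothetical sentence data: (sentence, (category, sub-category)).
train = [
    ("transfer money to savings", ("banking", "transfer")),
    ("transfer funds to checking", ("banking", "transfer")),
    ("play some jazz music", ("media", "play")),
    ("play the next song", ("media", "play")),
]
test = [
    ("transfer money to checking", ("banking", "transfer")),
    ("play jazz music", ("media", "play")),
]

best_depth, best_score, best_models = build_language_models(train, test, 2)
print(best_depth, sorted(best_models))
```

In this sketch a finer partition is retained only when it strictly improves the test score, so a split that merely reproduces the coarser grouping is not adopted. The sketch also assumes that test sentences carry the same categorization as the training sentences, which is a simplification made for the example.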