Method, device and electronic equipment for extracting and analyzing text key information of a work order system

By extracting key information from the work order system and performing cluster analysis, the problem of product managers being unable to efficiently understand user needs was solved, enabling precise product optimization and improved work efficiency.

CN115965007B · Active · Publication Date: 2026-06-30 · AISINO CORPORATION

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: AISINO CORPORATION
Filing Date: 2022-12-28
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

In the existing work order system, product managers are unable to efficiently and accurately understand users' key issues and needs, resulting in low product optimization efficiency.

Method used

By statistically analyzing the work order information in the training set, defining commonly used sentence templates and part-of-speech path templates, and combining stop words and invoice business-specific dictionaries, text segmentation and word segmentation are performed to extract keyword information. Word frequency statistics and cluster analysis are then conducted to generate a list of issues and requirements for product optimization.

Benefits of technology

This enables product managers to optimize products precisely, improves their work efficiency, and allows them to understand users' key issues and needs in a timely manner.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a method, apparatus, and electronic device for extracting and analyzing key information from text in a work order system. The method includes: statistically analyzing work order information in a training set and defining commonly used sentence templates and part-of-speech path templates; defining stop words and a dictionary specifically for invoice business for word segmentation; traversing the text to be analyzed in the work order system, matching the text with the commonly used sentence templates to form text segments; segmenting the text segments according to the part-of-speech path templates, combined with the stop words and the dictionary specifically for invoice business, and extracting keyword information; performing word frequency statistics and cluster analysis on the extracted keyword information to generate a list of issues and requirements to be optimized in the product. This invention helps product personnel understand users' key issues and expected needs, thereby achieving precise product optimization and improving work efficiency.

Description

Technical Field

[0001] This invention belongs to the field of data processing, and more specifically, relates to a method, apparatus, and electronic device for extracting and analyzing key information from work order system text.

Background Technology

[0002] For product optimization and improvement, understanding user pain points and difficulties is a crucial step, and the work order system is an important channel for identifying product issues. Existing work order systems have relatively rigorous processing procedures, involving roles such as work order submitters, operations and maintenance (O&M) technical support personnel, and product managers. Work order types in the system include requirements, suggestions, and defects. Some work orders are handled routinely by O&M personnel without further problem analysis; others are submitted as requirements to product managers and other personnel, who provide product optimization suggestions aimed at uncovering the real problems or user intent.

[0003] Currently, product managers and other personnel typically rely on manual analysis when faced with large amounts of work order data. This results in low work efficiency and often fails to provide timely and accurate insights into users' key issues and needs, making it difficult to achieve precise product optimization.

Summary of the Invention

[0004] The purpose of this invention is to provide a method, apparatus, and electronic device for extracting and analyzing key information from work order system texts, thereby helping product personnel understand users' key issues and expectations, and thus enabling precise product optimization and improved work efficiency.

[0005] Firstly, this invention proposes a method for extracting and analyzing key information from work order system text, including:

[0006] Perform statistical analysis on the work order information in the training set, and define commonly used sentence templates and part-of-speech path templates;

[0007] Define a stop word dictionary and a dedicated dictionary for invoice-related business, for use in word segmentation.

[0008] The text to be analyzed in the work order system is traversed, and the commonly used statement templates are used to perform statement template matching on the text to be analyzed to form text segments.

[0009] Based on the part-of-speech path template, combined with the stop words and the invoice business-specific dictionary, the text segments are segmented and keyword information is extracted.

[0010] The extracted keyword information is subjected to word frequency statistics and cluster analysis to generate a list of issues and requirements for product optimization.

[0011] Preferably, the method for defining the commonly used statement template includes:

[0012] The training set of work orders is analyzed and classified to form multiple types of classified work orders;

[0013] Extract common key identifiers from each type of work order category to generate the commonly used statement template.

[0014] Preferably, the types of classified work orders include problem type, transaction type, and requirement type.

[0015] Preferably, the format of the commonly used statement template is prefix + key identifier + suffix.

[0016] Preferably, the method for defining the part-of-speech path template includes:

[0017] For the work order information in the training set, match it according to the commonly used statement template to generate prefix words and suffix words;

[0018] Statistical analysis of keyword part-of-speech paths is performed on the prefixes and suffixes to extract the keyword part-of-speech paths of the top 3 prefixes and suffixes, forming the part-of-speech path template.

[0019] Preferably, the format of the text segmentation is prefix word + key identifier + suffix word, and the key identifier of the text segmentation is one of the key identifiers of the commonly used sentence template.

[0020] Preferably, the step of segmenting the text based on the part-of-speech path template, combined with the stop words and the invoice business-specific dictionary, and extracting keyword information includes:

[0021] The stop words and the invoice business-specific dictionary are used to segment the prefix and suffix words of the text segments;

[0022] Based on the part-of-speech path template of specific key identifiers, keywords are extracted from the prefix and suffix words of the text segments;

[0023] Repeat the above steps to extract keyword information from all text segments of the text to be analyzed.

[0024] Preferably, the step of performing word frequency statistics and cluster analysis on the extracted keyword information to generate a list of issues and requirements for product optimization includes:

[0025] Key phrase frequency statistics are performed on all extracted keywords to generate a key phrase frequency graph for product analysis.

[0026] Extract TF-IDF weights from keyword information and perform K-means clustering analysis to generate a list of the top N types of problems and requirements to be optimized for the product, where N is a positive integer.

[0027] Secondly, the present invention proposes a device for extracting and analyzing key information from work order system text, comprising:

[0028] A custom module is used to perform statistical analysis on the work order information in the training set, and to define commonly used sentence templates and part-of-speech path templates; as well as to define stop words and a dedicated dictionary for invoice business for word segmentation.

[0029] The text segmentation module is used to traverse the text to be analyzed in the work order system, and use the common statement templates to perform statement template matching on the text to be analyzed to form text segments.

[0030] The keyword extraction module is used to segment the text segments and extract keyword information based on the part-of-speech path template, the stop words, and the invoice business-specific dictionary.

[0031] The statistical analysis module is used to perform word frequency statistics and cluster analysis on the extracted keyword information to generate a list of issues and requirements for product optimization.

[0032] Thirdly, the present invention provides an electronic device, the electronic device comprising:

[0033] At least one processor; and,

[0034] A memory communicatively connected to the at least one processor; wherein,

[0035] The memory stores instructions that can be executed by the at least one processor, which, when executed by the at least one processor, enables the at least one processor to perform the method for extracting and analyzing key text information of the work order system as described in the first aspect.

[0036] The beneficial effects of this invention are as follows:

[0037] This invention first performs statistical analysis on the work order information in the training set, defines commonly used sentence templates and part-of-speech path templates, and defines stop words and a special dictionary for invoice business for word segmentation. Then, it traverses the text to be analyzed in the work order system, uses the predefined commonly used sentence templates to match the text to be analyzed, and forms text segments. Then, based on the part-of-speech path templates, combined with stop words and the special dictionary for invoice business, it performs word segmentation on the text segments and extracts keyword information. Finally, it performs word frequency statistics and cluster analysis on the extracted keyword information to generate a list of problems and requirements to be optimized in the product. This invention performs precise text analysis on the work order system, extracts key information, and performs categorized statistical analysis on the key information. The analysis results are fed back to front-line product personnel, enabling product managers to grasp the problems that urgently need to be solved and the direction of optimization, thereby achieving precise product optimization and improving work efficiency.

[0038] The system of the present invention has other features and advantages that will be apparent from or will be set forth in detail in the accompanying drawings and following detailed description, which together serve to explain the particular principles of the invention.

Attached Figure Description

[0039] The above and other objects, features and advantages of the present invention will become more apparent from the accompanying drawings, in which like reference numerals generally denote like parts.

[0040] Figure 1 shows a flowchart of the steps of a method for extracting and analyzing key information from work order system text according to the present invention.

Detailed Implementation

[0041] This invention performs in-depth analysis of text information in the work order system, extracts key information and performs cluster analysis, and feeds the analysis results back to front-line product personnel. This can help product personnel understand users' key issues and expectations, thereby achieving precise product optimization and improving work efficiency.

[0042] The invention will now be described in more detail with reference to the accompanying drawings. While preferred embodiments of the invention are shown in the drawings, it should be understood that the invention can be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that the invention will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

[0043] Example 1

[0044] As shown in Figure 1, a method for extracting and analyzing key information from work order system text includes:

[0045] S1: Perform statistical analysis on the work order information in the training set, and define commonly used sentence templates and part-of-speech path templates;

[0046] In this step, the method for defining the commonly used statement template includes:

[0047] The training set of work orders is analyzed and classified to form multiple types of classified work orders; the types of classified work orders include problem type, transaction type and requirement type.

[0048] For each type of work order, extract common key identifiers to generate commonly used statement templates. The format of the commonly used statement templates is prefix + key identifier + suffix.

[0049] In this step, the method for defining the part-of-speech path template includes:

[0050] For the work order information in the training set, match it according to the commonly used statement template to generate prefix words and suffix words;

[0051] Statistical analysis of keyword part-of-speech paths is performed on the prefixes and suffixes to extract the keyword part-of-speech paths of the top 3 prefixes and suffixes, forming the part-of-speech path template.

[0052] In a specific application scenario, this step is the training processing phase, which specifically includes:

[0053] S101: Manually analyze and classify the work orders in the work order system training set. The work orders are classified into three categories: problem, transaction, and requirement.

[0054] S102: Manually extract common key identifiers from classified work orders and generate statement templates for subsequent loading.

[0055] S103: The generated statement templates share a unified format: [prefix (preword) + key identifier (type) + suffix (afterword)]. For example, for the problem category, the extracted key identifiers are ['failure', 'error', 'wrong', 'not dropped', 'problem', 'not enough', 'too low', 'erroneous'] and ['no', 'unable', 'cannot', 'not', 'none']. For the transaction category, the extracted key identifiers are ['trouble', 'need', 'help', 'can', 'need']. For the requirement category, the extracted key identifiers are ['hope', 'suggestion', 'consider', 'new requirement'].

[0056] S104: Based on the work order information in the training set, prefix and suffix words are generated according to the sentence template matching. Keyword part-of-speech path statistical analysis is performed on the prefix and suffix words to extract the top 3 prefix and suffix word keyword part-of-speech paths for subsequent loading. Common part-of-speech paths include 'nz,v' and "nz,v,n,d,vn".
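The path statistics of S104 can be sketched as a simple frequency count. This is a minimal illustration, not the patented implementation: it assumes the prefix and suffix fragments have already been segmented and tagged, the sample fragments are invented, and the function name `top_pos_paths` is hypothetical. For brevity it pools all fragments together rather than keeping a separate count per key identifier.

```python
from collections import Counter


def top_pos_paths(tagged_fragments, n=3):
    """Return the n most frequent part-of-speech paths.

    tagged_fragments: iterable of fragments, each a list of (word, tag) pairs.
    A path is the comma-joined tag sequence of one fragment, e.g. "nz,v".
    """
    counts = Counter(
        ",".join(tag for _, tag in fragment)
        for fragment in tagged_fragments
        if fragment  # skip empty prefixes or suffixes
    )
    return [path for path, _ in counts.most_common(n)]


# Invented, pre-tagged prefix/suffix fragments standing in for a training set.
fragments = [
    [("invoice", "nz"), ("print", "v")],
    [("invoice", "nz"), ("upload", "v")],
    [("tax_disk", "nz"), ("read", "v")],
    [("invoice", "nz"), ("issue", "v"), ("page", "n"), ("always", "d"), ("loading", "vn")],
    [("invoice", "nz"), ("query", "v"), ("result", "n"), ("still", "d"), ("display", "vn")],
    [("system", "n")],
]

templates = top_pos_paths(fragments)
print(templates)
```

With this sample data the two paths named in [0056], 'nz,v' and 'nz,v,n,d,vn', rank first and second.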

[0057] S2: Defines stop words and a dictionary specifically for invoice business, for use in word segmentation;

[0058] S3: Traverse the text to be analyzed in the work order system, and use the commonly used statement templates to perform statement template matching on the text to be analyzed to form text segments;

[0059] The format of the text segmentation is prefix word + key identifier + suffix word, and the key identifier of the text segmentation is one of the key identifiers of the commonly used sentence template.

[0060] In the specific application scenario mentioned above, the text information of the work order system is traversed and matched against the sentence templates to form text segments. Each text segment has the format [preword, type, afterword], where type is one of the key identifiers of the sentence template.
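The template matching of S3 can be illustrated with a regular expression that splits each sentence at its first key identifier. This is a sketch under several assumptions: the identifiers are the English renderings listed in S103 (only a subset is used), sentences are split on common punctuation, and the names `KEY_IDENTIFIERS`, `build_pattern`, and `segment` are hypothetical. Word-boundary matching as written suits space-delimited text; unsegmented Chinese text would need a different matching rule.

```python
import re

# A subset of the key identifiers from S103, grouped by work order category.
KEY_IDENTIFIERS = {
    "problem": ["failure", "error", "unable", "cannot"],
    "transaction": ["trouble", "need", "help"],
    "requirement": ["hope", "suggestion", "consider", "new requirement"],
}


def build_pattern(identifiers):
    """Compile one alternation over all identifiers, longest first."""
    words = sorted(
        {w for group in identifiers.values() for w in group},
        key=len,
        reverse=True,
    )
    return re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")


def segment(text, identifiers=KEY_IDENTIFIERS):
    """Return (preword, type, afterword) for each sentence with a key identifier."""
    pattern = build_pattern(identifiers)
    segments = []
    for sentence in re.split(r"[.!?;。！？；]+", text):
        sentence = sentence.strip()
        match = pattern.search(sentence)
        if match:
            segments.append((
                sentence[:match.start()].strip(),  # preword
                match.group(1),                    # type (key identifier)
                sentence[match.end():].strip(),    # afterword
            ))
    return segments


segments = segment(
    "The invoice upload keeps reporting an error after login. "
    "We hope the export supports PDF."
)
for seg in segments:
    print(seg)
```

Sentences that contain no key identifier produce no segment, which matches the requirement that every text segment carries one of the template's key identifiers.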

[0061] S4: Based on the part-of-speech path template, combined with the stop words and the invoice business-specific dictionary, the text segments are segmented and keyword information is extracted;

[0062] This step specifically includes:

[0063] S401: Use the stop words and the invoice business-specific dictionary to segment the prefix and suffix words of the text segments;

[0064] S402: Extract keywords from the prefix and suffix words of the text segments based on the part-of-speech path template of a specific key identifier;

[0065] S403: Repeat the above steps to extract keyword information from all text segments of the text to be analyzed.

[0066] In the specific application scenario described above, for the preword and afterword of each text segment, the keyword KW is extracted based on the part-of-speech path template of the specific type (key identifier) extracted in step 2, in the form prekeyword + type + afterkeyword. This process is repeated to extract the keyword KW information from the text to be analyzed.
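One way to read S401 to S403 is as a search for a run of tokens whose tag sequence equals one of the part-of-speech path templates. The sketch below assumes the preword and afterword have already been segmented and tagged (the patent does not name a segmenter), uses an invented stop word list, and tries longer templates before shorter ones, which is a design choice of this example rather than something the source specifies. The names `match_path` and `extract_kw` are hypothetical.

```python
# Invented stop words for illustration only.
STOP_WORDS = {"the", "a", "an", "please"}


def match_path(tagged, paths, stop_words=STOP_WORDS):
    """Return the words of the first contiguous token run whose tags equal a path.

    tagged: list of (word, tag) pairs; paths: templates such as "nz,v".
    """
    tokens = [(w, t) for w, t in tagged if w not in stop_words]
    tags = [t for _, t in tokens]
    # Prefer the longest template so that "nz,v,n,d,vn" wins over "nz,v".
    for path in sorted((p.split(",") for p in paths), key=len, reverse=True):
        n = len(path)
        for i in range(len(tags) - n + 1):
            if tags[i:i + n] == path:
                return [w for w, _ in tokens[i:i + n]]
    return []


def extract_kw(pre_tagged, identifier, after_tagged, paths):
    """Build KW as prekeyword + type + afterkeyword."""
    pre_kw = match_path(pre_tagged, paths)
    after_kw = match_path(after_tagged, paths)
    return " ".join(pre_kw + [identifier] + after_kw)


paths = ["nz,v", "nz,v,n,d,vn"]
pre = [("the", "x"), ("invoice", "nz"), ("upload", "v")]
after = [("after", "p"), ("login", "n")]

kw = extract_kw(pre, "error", after, paths)
print(kw)
```

Here the afterword matches no template, so KW consists of the prefix keywords and the key identifier only.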

[0067] S5: Perform word frequency statistics and cluster analysis on the extracted keyword information to generate a list of issues and requirements for product optimization.

[0068] This step specifically includes:

[0069] Perform key phrase frequency statistics (i.e., the frequency of keyword occurrence) on all extracted keywords to generate a key phrase frequency chart for product analysis.

[0070] Extract TF-IDF weights from keyword information and perform K-means clustering analysis to generate a list of the top N types of problems and requirements to be optimized for the product, where N is a positive integer.

[0071] In the specific application scenarios mentioned above, key phrase frequency statistics are performed on KW information to form a key phrase frequency graph for product analysis.
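The frequency statistics amount to counting how often each extracted key phrase occurs. The following sketch uses invented KW strings and renders the "frequency graph" as a plain text bar chart; the actual chart format is not described in the source.

```python
from collections import Counter

# Invented KW strings standing in for the output of the extraction step.
keywords = [
    "invoice upload error",
    "invoice print failure",
    "invoice upload error",
    "hope export pdf",
]

freq = Counter(keywords)

# Text rendering of the frequency chart, most frequent phrase first.
for phrase, count in freq.most_common():
    print(f"{phrase:<24}{'#' * count} ({count})")
```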

[0072] TF-IDF weights are extracted from the KW information and K-means clustering analysis is performed to more accurately generate a list of the top 20 categories of problems and requirements to be optimized for the product (the number of categories can be configured according to needs).

[0073] TF-IDF (term frequency–inverse document frequency) is a commonly used weighting technique in information retrieval and data mining. It is a statistical method used to evaluate the importance of a word to a document within a document set or corpus. A word's importance increases in proportion to the number of times it appears in the document, but decreases as the word appears in more documents of the corpus. The main idea behind TF-IDF is that if a word or phrase has a high term frequency in one document but rarely appears in other documents, it is considered to have good class-discriminating ability and is suitable for classification.
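The patent does not state which TF-IDF variant is used. One common formulation, given here as an assumption, defines the weight of term t in work order d as follows, where n_{t,d} is the number of occurrences of t in d, N is the number of work orders in the corpus D, and the denominator of the IDF term counts the work orders containing t:

```latex
\mathrm{tf}(t,d) = \frac{n_{t,d}}{\sum_{k} n_{k,d}}, \qquad
\mathrm{idf}(t) = \log \frac{N}{\left|\{\, d' \in D : t \in d' \,\}\right|}, \qquad
\mathrm{tfidf}(t,d) = \mathrm{tf}(t,d) \cdot \mathrm{idf}(t)
```

As a worked example under this formulation with the natural logarithm: a term that occurs 3 times in a 10-word work order and appears in 10 of 1000 work orders has tf = 0.3 and idf = ln(100) ≈ 4.605, giving a weight of about 1.38. A term that appears in every work order has idf = ln(1) = 0 and therefore a weight of 0.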

[0074] The K-means clustering algorithm is a commonly used and simple clustering algorithm. It is an iterative clustering analysis algorithm that divides data into a specified number of k clusters, and the centroid of each cluster is calculated from the mean of the samples in each cluster. The idea behind this clustering algorithm is to continuously calculate the distance between each sample point and the cluster center until convergence. The specific steps are as follows:

[0075] (1) Randomly select k sample points from the keyword data as the original cluster centers.

[0076] (2) Calculate the distance between each remaining sample and the k cluster centers, and assign each sample to the cluster whose center is closest.

[0077] (3) Recalculate the mean of the sample points in each cluster and use the mean as the new k cluster centers.

[0078] (4) Repeat (2) and (3) until the changes in the cluster centers tend to stabilize, forming the final k clusters.
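Steps (1) to (4) can be sketched in plain Python as follows. This is a minimal illustration with several assumptions: samples are numeric tuples, distance is squared Euclidean, and the loop stops when cluster assignments no longer change (which implies the centers no longer change either). The function names and the sample points are invented.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def k_means(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # step (1): random initial centers
    labels = None
    for _ in range(max_iter):
        # Step (2): assign every sample to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # step (4): assignments are stable
            break
        labels = new_labels
        # Step (3): recompute each center as the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = mean(members)
    return labels, centers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Because the initial centers are chosen at random, the numeric label given to each cluster depends on the seed; only the grouping itself is meaningful.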

[0079] This embodiment uses the TF-IDF algorithm to extract keyword weights, constructing a matrix of work orders and feature words (reflecting the importance of a word in a work order). For a single work order, the top K words with the highest weights are extracted as the keywords for that work order. Finally, k-means clustering is used to group work orders with similar problems and needs into the same category (group), and to ensure that the problems and needs between different categories (groups) are as different as possible, thereby generating a classification of the problems and needs to be optimized for the product.
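The work order by feature word matrix and the per-work-order top K keywords can be sketched as below. The weighting uses the unsmoothed tf × log(N/df) form as an assumption, the tokenized work orders are invented, and `tfidf_matrix` and `top_k_terms` are hypothetical names. Each row is stored as a sparse dictionary rather than a dense vector.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return one {term: weight} dict per work order.

    docs: list of token lists, one list per work order.
    """
    n_docs = len(docs)
    # Document frequency: in how many work orders each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return rows


def top_k_terms(row, k):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(row.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Invented, already-segmented work orders.
docs = [
    ["invoice", "upload", "error", "invoice"],
    ["invoice", "print", "failure"],
    ["hope", "export", "pdf", "invoice"],
]

rows = tfidf_matrix(docs)
keywords_per_order = [top_k_terms(row, 2) for row in rows]
print(keywords_per_order)
```

In this toy corpus "invoice" appears in every work order, so its weight is zero and it never ranks among the top keywords, which illustrates why a domain-wide term contributes little to separating clusters. In practice a library could supply both the weighting and the clustering; the patent does not name one.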

[0080] The method in this embodiment performs precise text analysis on the work order system, extracts key information, and categorizes and statistically analyzes the key information, enabling product managers to grasp the problems that urgently need to be solved and the direction for optimization.

[0081] Example 2

[0082] This embodiment provides a device for extracting and analyzing key information from a work order system text, comprising:

[0083] A custom module is used to perform statistical analysis on the work order information in the training set, and to define commonly used sentence templates and part-of-speech path templates; as well as to define stop words and a dedicated dictionary for invoice business for word segmentation.

[0084] The text segmentation module is used to traverse the text to be analyzed in the work order system, and use the common statement templates to perform statement template matching on the text to be analyzed to form text segments.

[0085] The keyword extraction module is used to segment the text segments and extract keyword information based on the part-of-speech path template, the stop words, and the invoice business-specific dictionary.

[0086] The statistical analysis module is used to perform word frequency statistics and cluster analysis on the extracted keyword information to generate a list of issues and requirements for product optimization.

[0087] In this embodiment, the method for defining the commonly used statement template through the custom module includes:

[0088] The training set of work orders is analyzed and classified to form multiple types of classified work orders;

[0089] Extract common key identifiers from each type of work order category to generate the commonly used statement template.

[0090] The types of work orders categorized include problem-based, transaction-based, and requirement-based. The format of the commonly used statement template is prefix + key identifier + suffix.

[0091] In this embodiment, the method for defining the part-of-speech path template through the custom module includes:

[0092] For the work order information in the training set, match it according to the commonly used statement template to generate prefix words and suffix words;

[0093] Statistical analysis of keyword part-of-speech paths is performed on the prefixes and suffixes to extract the keyword part-of-speech paths of the top 3 prefixes and suffixes, forming the part-of-speech path template.

[0094] The format of the text segmentation is prefix word + key identifier + suffix word, and the key identifier of the text segmentation is one of the key identifiers of the commonly used sentence template.

[0095] In this embodiment, the keyword extraction module is specifically used for:

[0096] The stop words and the invoice business-specific dictionary are used to segment the prefix and suffix words of the text segments;

[0097] Based on the part-of-speech path template of specific key identifiers, keywords are extracted from the prefix and suffix words of the text segments;

[0098] Repeat the above steps to extract keyword information from all text segments of the text to be analyzed.

[0099] In this embodiment, the statistical analysis module is specifically used for:

[0100] Key phrase frequency statistics are performed on all extracted keywords to generate a key phrase frequency graph for product analysis.

[0101] Extract TF-IDF weights from keyword information and perform K-means clustering analysis to generate a list of the top N types of problems and requirements to be optimized for the product, where N is a positive integer.

[0102] Example 3

[0103] This embodiment provides an electronic device, the electronic device comprising:

[0104] At least one processor; and,

[0105] A memory communicatively connected to the at least one processor; wherein,

[0106] The memory stores instructions that can be executed by the at least one processor, which, when executed by the at least one processor, enables the at least one processor to perform the method for extracting and analyzing key text information of the work order system as described in Embodiment 1.

[0107] An electronic device according to embodiments of the present disclosure includes a memory and a processor. The memory is used to store non-transitory computer-readable instructions. Specifically, the memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and / or non-volatile memory. The volatile memory may, for example, include random access memory (RAM) and / or cache memory. The non-volatile memory may, for example, include read-only memory (ROM), hard disk, flash memory, etc.

[0108] The processor may be a central processing unit (CPU) or other form of processing unit with data processing capabilities and / or instruction execution capabilities, and may control other components in the electronic device to perform desired functions. In one embodiment of this disclosure, the processor is used to execute computer-readable instructions stored in the memory.

[0109] Those skilled in the art will understand that, in order to solve the technical problem of how to achieve a good user experience, this embodiment may also include well-known structures such as communication buses and interfaces, and these well-known structures should also be included within the protection scope of this disclosure.

[0110] For a detailed description of this embodiment, please refer to the corresponding descriptions in the foregoing embodiments, which will not be repeated here.

[0111] The various embodiments of the present invention have been described above. These descriptions are exemplary and not exhaustive, nor are they limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments.

Claims

1. A method for extracting and analyzing key text information of a work order system, characterized in that it comprises:
performing statistical analysis on the work order information in the training set, and defining commonly used statement templates and part-of-speech path templates; wherein defining the commonly used statement templates includes: analyzing and classifying the work orders in the training set to form multiple types of classified work orders; and extracting common key identifiers from each type of classified work order to generate the commonly used statement templates, the format of which is prefix + key identifier + suffix; and wherein defining the part-of-speech path templates includes: matching the work order information in the training set against the commonly used statement templates to generate prefix words and suffix words; performing keyword part-of-speech path statistical analysis on the prefix words and suffix words; and extracting the top 3 keyword part-of-speech paths of the prefix words and suffix words to form the part-of-speech path templates corresponding to the key identifiers;
defining stop words and an invoice business-specific dictionary for use in word segmentation;
traversing the text to be analyzed in the work order system, and matching the text to be analyzed against the commonly used statement templates to form text segments; wherein the format of each text segment is prefix word + key identifier + suffix word, and the key identifier of the text segment is one of the key identifiers of the commonly used statement templates;
segmenting the text segments and extracting keyword information based on the part-of-speech path templates, in combination with the stop words and the invoice business-specific dictionary, which includes: segmenting the prefix words and suffix words of the text segments using the stop words and the invoice business-specific dictionary; extracting keywords from the prefix words and suffix words of the text segments based on the part-of-speech path template of the specific key identifier; and repeating the above steps to extract keyword information from all text segments of the text to be analyzed; and
performing word frequency statistics and cluster analysis on the extracted keyword information to generate a list of problems and requirements to be optimized for the product, which includes: performing key phrase frequency statistics on all extracted keywords to form a key phrase frequency graph for product analysis; and extracting TF-IDF weights from the keyword information and performing K-means cluster analysis to generate a list of the top N types of problems and requirements to be optimized for the product, where N is a positive integer.

2. The method according to claim 1, characterized in that the types of classified work orders include problem type, transaction type, and requirement type.

3. A device for extracting and analyzing key text information of a work order system, characterized in that it comprises:
a custom module, used to perform statistical analysis on the work order information in the training set and to define commonly used statement templates and part-of-speech path templates, and further used to define stop words and an invoice business-specific dictionary for word segmentation; wherein defining the commonly used statement templates includes: analyzing and classifying the work orders in the training set to form multiple types of classified work orders; and extracting common key identifiers from each type of classified work order to generate the commonly used statement templates, the format of which is prefix + key identifier + suffix; and wherein defining the part-of-speech path templates includes: matching the work order information in the training set against the commonly used statement templates to generate prefix words and suffix words; performing keyword part-of-speech path statistical analysis on the prefix words and suffix words; and extracting the top 3 keyword part-of-speech paths of the prefix words and suffix words to form the part-of-speech path templates corresponding to the key identifiers;
a text segmentation module, used to traverse the text to be analyzed in the work order system and to match the text to be analyzed against the commonly used statement templates to form text segments; wherein the format of each text segment is prefix word + key identifier + suffix word, and the key identifier of the text segment is one of the key identifiers of the commonly used statement templates;
a keyword extraction module, used to segment the text segments and extract keyword information based on the part-of-speech path templates, the stop words, and the invoice business-specific dictionary, which includes: segmenting the prefix words and suffix words of the text segments using the stop words and the invoice business-specific dictionary; extracting keywords from the prefix words and suffix words of the text segments based on the part-of-speech path template of the specific key identifier; and repeating the above steps to extract keyword information from all text segments of the text to be analyzed; and
a statistical analysis module, used to perform word frequency statistics and cluster analysis on the extracted keyword information to generate a list of problems and requirements to be optimized for the product, which includes: performing key phrase frequency statistics on all extracted keywords to form a key phrase frequency graph for product analysis; and extracting TF-IDF weights from the keyword information and performing K-means cluster analysis to generate a list of the top N types of problems and requirements to be optimized for the product, where N is a positive integer.

4. An electronic device, characterized in that the electronic device comprises: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method for extracting and analyzing key text information of a work order system as described in claim 1 or 2.