Unstructured vehicle condition information extraction method and system based on natural language processing
By mapping unstructured vehicle condition descriptions onto the vehicle functional topology network and performing deduction over it, the problems of poor logical consistency and low information completeness in the prior art are solved, and high-quality vehicle condition information extraction is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING YUCHEXING INFORMATION TECHNOLOGY CO LTD
- Filing Date
- 2026-03-31
- Publication Date
- 2026-06-12
AI Technical Summary
Existing technologies, when extracting information from multi-source unstructured vehicle condition text, lack utilization of the inherent relationships within the vehicle system, resulting in poor logical consistency and low information completeness in the extraction results.
By acquiring unstructured mixed text, key descriptive fragments are extracted using a pre-trained natural language processing model and mapped onto the vehicle functional topology network for inference. Logical inconsistencies between fragments are discovered. Based on this, auxiliary descriptive text is extracted from the original context and integrated with key descriptive fragments for reasoning to generate structured vehicle state metadata.
It effectively identifies and corrects logical contradictions between text fragments from different sources, improves the internal consistency and logical rationality of information, and solves the problem of incomplete and unreliable extraction results caused by vague descriptions or missing information in single texts.
Figure CN122197897A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the fields of natural language processing and vehicle information processing technology, and in particular to a method and system for extracting unstructured vehicle condition information based on natural language processing.
Background Technology
[0002] With the popularization of intelligent connected vehicles, the multi-source heterogeneous unstructured text data generated during vehicle operation, such as vehicle logs, sensor reports, and maintenance dialogue records, is becoming increasingly large and complex. Automatically extracting from these texts structured vehicle condition information that reflects the true health status of the vehicle and can be directly analyzed and used by downstream systems is the foundation for predictive maintenance, accurate diagnosis, and vehicle value assessment.
[0003] Currently, to address the aforementioned technical needs, the adopted technical solution is to utilize a natural language processing model pre-trained on general corpora and some vehicle texts to directly identify entities and phrases mentioning vehicle parts, fault phenomena, and parameter indicators from the mixed text, and then combine these identification results according to predefined templates or rules to output structured records.
[0004] However, this method still has significant drawbacks. Essentially, it identifies and reassembles information at the surface level of the text, and its processing is completely independent of the vehicle's internal working principles as a complex system. When text fragments from different sources contradict each other at the system level, or when a single text fragment is incomplete due to ambiguity, the method cannot discover these contradictions based on the functional relationships of the vehicle system, nor can it actively and purposefully search the original context for information that can supplement or correct the current understanding. As a result, the generated structured record may contain internal logical conflicts or missing information, which affects the accuracy and reliability of vehicle condition judgment.
Summary of the Invention
[0005] This application provides a method and system for extracting unstructured vehicle condition information based on natural language processing, which solves the problem in the prior art that the extraction results have poor logical consistency and low information completeness due to the lack of utilization of the inherent correlation of the vehicle system when extracting information from multi-source unstructured vehicle condition text.
[0006] In a first aspect, this application provides a method for extracting unstructured vehicle condition information based on natural language processing, including: obtaining the unstructured mixed text of the target vehicle; using a pre-trained natural language processing model to extract key description fragments describing the vehicle's state from the unstructured mixed text; mapping the key description fragments onto the vehicle functional topology network and deducing the events described by the key description fragments to discover logical inconsistencies between the fragments, where the vehicle functional topology network reflects the connections and dependencies between the subsystems of the vehicle; based on the logical inconsistencies, extracting relevant auxiliary description text from the original context of the unstructured mixed text; and combining the key description fragments and the auxiliary description text and performing integrated reasoning under the rule constraints of the vehicle functional topology network to generate vehicle state metadata.
[0007] Optionally, a pre-trained natural language processing model is used to extract key descriptive fragments describing the vehicle's state from the unstructured mixed text, including: From the unstructured mixed text, text entries from the vehicle system log, parameter description text from sensor reports, and colloquial description text from the maintenance technician's records are identified; The text entries, parameter description text, and colloquial description text are input into the natural language processing model. The fragment containing vehicle control instructions is located in the text entries, the fragment containing a combination of numerical readings and measurement units is located in the parameter description text, and the fragment containing the name of the target vehicle component and a description of the abnormal phenomenon is located in the colloquial description text. Based on a pre-stored set of vehicle-related specialized terms, all fragments are filtered, and target fragments whose vocabulary matches the set of vehicle-related specialized terms are retained. Based on the context of the original text in which the target fragment is located, determine whether the target fragment independently describes the complete vehicle state or needs to be combined with adjacent text to describe the complete vehicle state; The target segment that is determined to need to be combined is merged with the adjacent text in the original text to form a merged description unit; All segments determined to be independent descriptions, along with the merged description units, are designated as key description segments.
[0008] Optionally, the key description fragments are mapped to a vehicle functional topology network, and the events described by the key description fragments are deduced to discover logical inconsistencies between fragments, including: From the vehicle function topology network, determine the network nodes corresponding to the vehicle component names mentioned in each of the key description segments; The vehicle state described by the key description fragment is converted into a state tag attached to the corresponding network node; In the vehicle functional topology network, starting from the network node with the attached status label, the vehicle status represented by the status label is transmitted to the downstream network node that has a direct connection relationship with the network node with the attached status label, according to the predefined connection line direction in the vehicle functional topology network. Based on the internal relationships of the vehicle system represented by the connection lines, determine the expected form of the vehicle state when it is transmitted to downstream network nodes, and record the expected form of the state as an expected state marker. Inspect the network nodes in the vehicle function topology network that have been directly associated with the key description fragment and marked with status tags; The status markers in the detected network nodes are compared with the expected status markers. If the comparison finds that they meet the preset conditions, it is determined that a logical inconsistency has been found.
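The propagation and comparison described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the edge list, the `transform` callback, and the choice of "observed state differs from expected state" as the preset condition are all assumptions, since the application does not specify them.

```python
def find_inconsistencies(edges, observed, transform):
    """Propagate observed states one hop downstream and flag mismatches.

    edges:     list of (upstream, downstream) node-name pairs (hypothetical topology)
    observed:  dict mapping node name -> state tag taken from key description fragments
    transform: callable (upstream, downstream, state) -> expected downstream state
    """
    issues = []
    for upstream, downstream in edges:
        if upstream not in observed:
            continue  # nothing to propagate from an untagged node
        expected = transform(upstream, downstream, observed[upstream])
        # Assumed preset condition: the downstream node carries its own tag
        # and that tag differs from the expected state marker.
        if downstream in observed and observed[downstream] != expected:
            issues.append((downstream, observed[downstream], expected))
    return issues


# Illustrative data only; node names and states are made up for the sketch.
edges = [("throttle body", "intake manifold"), ("intake manifold", "engine")]
observed = {"throttle body": "abnormal", "intake manifold": "normal"}
issues = find_inconsistencies(edges, observed, lambda up, down, state: state)
```

Here the identity transform stands in for whatever conversion rule the connection line implies; only directly connected downstream nodes are checked, matching the one-hop transmission described in the text.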
[0009] Optionally, extracting relevant auxiliary description text from the original context of the unstructured mixed text based on the logical inconsistency includes: based on the logical inconsistency, identifying the target network nodes in the vehicle functional topology network involved in the logical inconsistency; tracing back to the unstructured mixed text and locating the original text paragraphs from which the key description fragments for the vehicle component corresponding to each target network node were extracted; expanding the scope from the original text paragraph to the adjacent text paragraphs before and after it to obtain an extended context text block containing the original text paragraph; filtering, from the extended context text block, the text sentences containing vehicle component names, status descriptors, and action descriptors related to the target network node; and combining all filtered text sentences to form the auxiliary description text related to the logical inconsistency.
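As a rough illustration of the context expansion and sentence filtering, the sketch below assumes the mixed text is already split into paragraphs and that the terms related to a target node are available as a plain list. The function name, the `window` parameter, and the naive sentence splitter are hypothetical.

```python
import re


def extract_auxiliary_text(paragraphs, origin_index, node_terms, window=1):
    """Expand around the original paragraph and keep node-related sentences.

    paragraphs:   list of paragraph strings from the mixed text
    origin_index: index of the paragraph the key fragment came from
    node_terms:   component names / status words / action words for the node
    window:       how many neighbouring paragraphs to include on each side
    """
    start = max(0, origin_index - window)
    end = min(len(paragraphs), origin_index + window + 1)
    block = " ".join(paragraphs[start:end])
    # Naive sentence split after ., ;, ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.;!?])\s+", block) if s.strip()]
    kept = [s for s in sentences if any(t.lower() in s.lower() for t in node_terms)]
    return " ".join(kept)


# Made-up maintenance record used only to exercise the sketch.
paragraphs = [
    "Customer arrived in the morning.",
    "Engine vibrates noticeably at red lights. No fault codes were found.",
    "Technician cleaned the throttle body. Idle speed was then stable.",
    "Invoice printed.",
]
aux = extract_auxiliary_text(paragraphs, 1, ["engine", "throttle", "idle"])
```

Substring matching is used for brevity; a real system would more likely match on the entities recognized by the natural language processing model.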
[0010] Optionally, by combining the key description fragments and the auxiliary description text, and performing integrated reasoning under the rule constraints of the vehicle functional topology network, vehicle state metadata is generated, including: The auxiliary description text is input into the natural language processing model to extract the target auxiliary description fragment describing the vehicle state; The target auxiliary description fragment is mapped to the vehicle function topology network to determine the network node associated with the target auxiliary description fragment, and the vehicle state described by the target auxiliary description fragment is converted into an auxiliary state label attached to the corresponding network node; In the vehicle functional topology network, for each target network node identified due to logical inconsistency, the state marker directly marked by the key description fragment on the target network node is compared with all auxiliary state markers mapped to the target network node. Based on the comparison results, if the content of any auxiliary state marker reconciles the logical inconsistency, then the auxiliary state marker is used to update the original state marker on the target network node; if none of the auxiliary state markers can reconcile the logical inconsistency, then the record of the logical inconsistency and all related state markers are retained. After updating and recording all target network nodes, in the vehicle function topology network, starting from each network node marked with a status mark, the vehicle status represented by the updated status mark is transmitted to all downstream network nodes again and a status consistency check is performed according to the direction of the connection line. 
The system integrates all updated and checked node status markers in the vehicle functional topology network that remain consistent, as well as recorded unresolved logical inconsistencies, and organizes them into structured data according to a preset metadata format. This structured data serves as vehicle status metadata, which includes vehicle components, status descriptions, relationships, and consistency conclusions.
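The reconciliation logic in this paragraph could look roughly like the sketch below. It assumes, purely for illustration, that an auxiliary state marker "reconciles" an inconsistency when it equals the expected state; the application leaves that criterion open, and the dictionary layout of the metadata is likewise hypothetical.

```python
def reconcile(observed, expected, auxiliary):
    """Update inconsistent node states from auxiliary markers where possible.

    observed:  dict node -> state marker from key description fragments
    expected:  dict node -> expected state marker for inconsistent target nodes
    auxiliary: dict node -> list of auxiliary state markers
    Returns (updated_states, unresolved_records).
    """
    updated = dict(observed)
    unresolved = []
    for node, exp in expected.items():
        obs = observed.get(node)
        if obs is None or obs == exp:
            continue  # no conflict recorded on this node
        candidates = auxiliary.get(node, [])
        # Assumed criterion: an auxiliary marker reconciles if it equals the expectation.
        match = next((a for a in candidates if a == exp), None)
        if match is not None:
            updated[node] = match
        else:
            unresolved.append({"node": node, "observed": obs,
                               "expected": exp, "auxiliary": list(candidates)})
    return updated, unresolved


def build_metadata(updated, unresolved):
    """Organize node states and open conflicts into one structured record."""
    open_nodes = {u["node"] for u in unresolved}
    return {
        "components": [{"component": n, "state": s, "consistent": n not in open_nodes}
                       for n, s in updated.items()],
        "unresolved": unresolved,
    }


observed = {"engine": "no fault found", "throttle body": "deviation exceeds limit"}
expected = {"engine": "unstable idle"}
updated, unresolved = reconcile(observed, expected, {"engine": ["unstable idle"]})
metadata = build_metadata(updated, unresolved)
```

After this step the updated states would be propagated downstream again for the second consistency check; that re-propagation is omitted here.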
[0011] Optionally, based on the internal relationships of the vehicle system represented by the connection line, the expected manifestation of the vehicle state when transmitted to downstream network nodes is determined, and the expected manifestation is recorded as an expected state marker, including: Based on the type of connection line in the vehicle functional topology network that connects the network node with the attached state label to the downstream network node, the conversion rule from the vehicle state represented by the state label to the state of the downstream network node is determined. The type of connection line includes lines that represent direct causal relationships, lines that represent sequential signal transmission, and lines that represent functional dependencies. According to the determined conversion rules, the state description content contained in the state tag on the network node with the attached state tag is logically converted to obtain the converted state content corresponding to the downstream network node. The transformed state content is combined with information identifying the downstream network node and information identifying the starting network node and connection line on which the transformation is based to form an expected state label for the downstream network node.
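One way to picture the per-line-type conversion rules is a lookup table keyed by connection type. The three keys mirror the line types named above, but the rule bodies, the marker fields, and the example node names are assumptions made for this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedStateMarker:
    node: str         # downstream node the expectation applies to
    state: str        # converted state content
    source_node: str  # starting node the conversion is based on
    edge_type: str    # connection line type the conversion is based on


# Hypothetical conversion rules, one per connection line type.
RULES = {
    "causal": lambda s: f"affected by upstream '{s}'",
    "signal": lambda s: f"receives signal reflecting '{s}'",
    "dependency": lambda s: f"function degraded while upstream is '{s}'",
}


def expected_marker(source_node, state, edge_type, target_node):
    """Convert an upstream state into an expected marker for the downstream node."""
    rule = RULES[edge_type]  # raises KeyError for an unknown line type
    return ExpectedStateMarker(target_node, rule(state), source_node, edge_type)


marker = expected_marker("throttle body", "opening below command",
                         "causal", "intake manifold")
```

Keeping the source node and line type inside the marker preserves the provenance that the paragraph requires, so a later mismatch can be traced back to the rule that produced the expectation.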
[0012] Optionally, the target auxiliary description fragment is mapped to the vehicle function topology network to determine the network node associated with the target auxiliary description fragment, and the vehicle state described by the target auxiliary description fragment is converted into an auxiliary state label attached to the corresponding network node, including: Identify the names of vehicle components mentioned in the target auxiliary description fragment and the text content describing the status of the components; In the vehicle function topology network, find the network node that matches the name of the vehicle component and use it as the network node associated with the target auxiliary description fragment; The text content describing the state of the component is analyzed to determine the type and degree of vehicle state expressed therein; The determined vehicle state type and state degree description are combined into state description entries according to the preset formatting rules. The state description entry is bound to information identifying the network node associated with the state description entry to form an auxiliary state tag.
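A simplified sketch of forming an auxiliary state tag is shown below. The keyword tables for state type and degree, and the `type:degree` entry format, are hypothetical placeholders for the analysis and preset formatting rules the application refers to.

```python
# Hypothetical keyword tables; a real system would rely on the NLP model's output.
STATE_TYPES = {"vibrat": "vibration", "shak": "vibration",
               "leak": "leakage", "overheat": "overheating"}
DEGREE_WORDS = {"slightly": "mild", "noticeably": "moderate",
                "significantly": "severe", "severely": "severe"}


def make_auxiliary_tag(fragment, node_names):
    """Bind a formatted state description entry to the matching network node.

    Returns None when no known node name appears in the fragment.
    """
    text = fragment.lower()
    node = next((n for n in node_names if n.lower() in text), None)
    if node is None:
        return None
    state_type = next((v for k, v in STATE_TYPES.items() if k in text), "unknown")
    degree = next((v for k, v in DEGREE_WORDS.items() if k in text), "unspecified")
    return {"node": node, "entry": f"{state_type}:{degree}"}


tag = make_auxiliary_tag("The engine vibrates noticeably at idle",
                         ["throttle body", "engine"])
```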
[0013] In a second aspect, this application provides a system for extracting unstructured vehicle condition information based on natural language processing, including: an acquisition module, used to acquire unstructured mixed text of the target vehicle; a first extraction module, used to extract key description fragments describing the vehicle's state from the unstructured mixed text using a pre-trained natural language processing model; a deduction module, used to map the key description fragments onto the vehicle functional topology network and to deduce the events described by the key description fragments in order to discover logical inconsistencies between fragments, where the vehicle functional topology network reflects the connections and dependencies between the subsystems of the vehicle; a second extraction module, used to extract relevant auxiliary description text from the original context of the unstructured mixed text based on the logical inconsistency; and an integration module, used to combine the key description fragments and the auxiliary description text and perform integrated reasoning under the rule constraints of the vehicle functional topology network to generate vehicle state metadata.
[0014] In a third aspect, this application provides a computing device, including a processing component and a storage component; the storage component stores one or more computer instructions; the one or more computer instructions are invoked and executed by the processing component to implement the method for extracting unstructured vehicle condition information based on natural language processing as described in the first aspect above.
[0015] In a fourth aspect, this application provides a computer storage medium storing a computer program which, when executed by a computer, implements the method for extracting unstructured vehicle condition information based on natural language processing as described in the first aspect.
[0016] This application maps extracted key descriptive fragments to a functional topology network reflecting the actual system connections of a vehicle, and infers descriptive events based on the rules of this network. This proactively identifies implicit logical contradictions between text fragments from different sources, effectively overcoming the problem of traditional solutions outputting contradictory information due to neglecting system relationships. Specifically, based on the identified logical inconsistencies, this application selectively extracts relevant information from the original text context, enabling subsequent information integration to specifically correct or supplement the initial extraction results, thereby improving the internal consistency and logical rationality of the extracted information.
[0017] Furthermore, after supplementing and extracting auxiliary text, it is transformed into a structured state label that can be precisely associated with the nodes of the topological network. This process enables the supplementary information mined from the context to participate again in the global consistency verification and reasoning based on system knowledge in a standardized form, thereby solving the problem of incomplete and unreliable extraction results caused by the ambiguity or lack of information in a single textual description.
[0018] These or other aspects of this application will become more apparent in the following description of the embodiments.
Attached Figure Description
[0019] To more clearly illustrate the technical solutions in this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0020] Figure 1 is a flowchart of a method for extracting unstructured vehicle condition information based on natural language processing provided in this application.
Figure 2 is a schematic structural diagram of a system for extracting unstructured vehicle condition information based on natural language processing provided in this application.
Figure 3 is a schematic structural diagram of a computing device provided in this application.
Detailed Implementation
[0021] To enable those skilled in the art to better understand the present application, the technical solution of the present application will be clearly and completely described below with reference to the accompanying drawings.
[0022] In some of the processes described in the specification, claims, and accompanying drawings of this application, multiple operations appearing in a specific order are included. However, it should be clearly understood that these operations may not be executed in the order they appear herein, or may be executed in parallel. The operation numbers, such as 101, 102, etc., are merely used to distinguish different operations and do not themselves represent any execution order. Furthermore, these processes may include more or fewer operations, and these operations may be executed sequentially or in parallel. It should be noted that the descriptions such as "first," "second," etc., in this document are used to distinguish different messages, devices, modules, etc., and do not represent a chronological order, nor do they limit "first" and "second" to different types.
[0023] The technical solutions of this application will now be clearly and completely described with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0024] Figure 1 is a flowchart of a method for extracting unstructured vehicle condition information based on natural language processing provided in this application. As shown in Figure 1, the method includes:
Step 101: Obtain the unstructured mixed text of the target vehicle.
[0025] In this step, the target vehicle refers to the specific vehicle from which vehicle condition information needs to be extracted.
[0026] Unstructured mixed text refers to a collection of text data obtained from different sources without a unified fixed format. In the context of this application, it specifically includes three types: the first is vehicle system logs, which are record files automatically generated by the vehicle's internal electronic control unit during operation, typically containing timestamps, control command codes, system status codes, and brief result descriptions; the second is sensor report text, which is text information generated by various vehicle sensors, such as pressure sensors, containing numerical readings and corresponding status descriptions; and the third is speech-to-text by repair technicians, which is a text record converted from the fault phenomena and operating procedures dictated by repair technicians during vehicle inspection using speech recognition technology.
[0027] In this step, firstly, the vehicle system log files generated during the recent operation of the target vehicle are exported from its onboard system. Secondly, periodic or event-triggered reports generated by the various sensors of the vehicle and output in text format are obtained from the diagnostic tools or data platforms connected to the vehicle. Finally, the latest maintenance service records of the vehicle are retrieved from the maintenance management system, which include dialogues or oral notes recorded by technicians using recording equipment and converted into text by automatic speech recognition technology. The above three types of text data are summarized to obtain the required unstructured mixed text.
[0028] For example, take vehicle A, which has an unstable idling fault. First, a log text is exported from vehicle A's vehicle system, which includes the entry "time: 10:05:23, throttle opening command: 15%, actual opening: 12%, status: deviation exceeds limit". Second, a text stating "intake manifold absolute pressure sensor reading: 65 kPa, below the standard range" is obtained from vehicle A's engine sensor data report. Third, a shop visit record for vehicle A is obtained from the maintenance record system, which includes text dictated by technician B and converted by speech recognition, stating that the owner reported that the engine vibrated significantly when waiting at a red light and that no fault codes were found during the initial inspection. Finally, the three texts are merged to obtain the unstructured mixed text.
[0029] Step 102: Using a pre-trained natural language processing model, extract key descriptive fragments describing the vehicle's state from the unstructured mixed text.
[0030] Optionally, step 102 may specifically include: Step 1021: Identify text entries from the vehicle system log, parameter description text from sensor reports, and colloquial description text from the maintenance technician's records from the unstructured mixed text.
[0031] Step 1022: Input the text entry, the parameter description text, and the colloquial description text into the natural language processing model. Locate the segment containing vehicle control instructions in the text entry, locate the segment containing a combination of numerical readings and measurement units in the parameter description text, and locate the segment containing the target vehicle component name and anomaly description in the colloquial description text.
[0032] Step 1023: Based on the pre-stored set of vehicle domain proper nouns, filter all fragments and retain the target fragments whose words match the set of vehicle domain proper nouns.
[0033] Step 1024: Based on the context of the original text in which the target fragment is located, determine whether the target fragment independently describes the complete vehicle state or needs to be combined with adjacent text to describe the complete vehicle state.
[0034] Step 1025: The target segment that is determined to need to be combined is merged with the adjacent preceding and following text in the original text to form a merged description unit.
[0035] Step 1026: All segments determined to be independent descriptions, as well as the merged description units, are designated as key description segments.
[0036] In this step, the pre-trained natural language processing model refers to a computer algorithm model that is trained on a large amount of text data to understand human language patterns and perform specific language tasks. In the scenario of this application, the model is specifically trained to identify entities and descriptions related to the technical state of the vehicle.
[0037] Key descriptive fragments refer to text units that are located and filtered from the original mixed text and can be used independently or in combination to characterize a specific condition of a vehicle component or system.
[0038] A text entry refers to a record with complete semantics that is identified from the vehicle system log.
[0039] Parameter description text refers to the statements identified from sensor reports that contain specific measurement values and their physical units.
[0040] Colloquial descriptive text refers to narrative text with characteristics of everyday spoken language that is identified from the records of maintenance technicians.
[0041] Vehicle control commands refer to the command codes or keywords in the vehicle system log that indicate the execution of a specific operation, such as adjusting the throttle opening or starting the fuel pump.
[0042] The combination of numerical readings and measurement units refers to specific data and their measurement labels, such as 65 kPa or 12%.
[0043] The target vehicle component name refers to the name of a component that has a clear designation in the vehicle field, such as the throttle body, intake manifold absolute pressure sensor, and engine.
[0044] The description of abnormal phenomena refers to words such as obvious shaking, deviation exceeding the limit, or below the standard range, indicating an abnormal state.
[0045] The vehicle-specific terminology collection is a pre-built vocabulary that includes standard names and common expressions for various vehicle subsystems, components, fault phenomena, and parameter indicators.
[0046] The target segment refers to a text segment containing vehicle-related words after initial location.
[0047] The context of the original text refers to the content of the sentences immediately preceding and following the target fragment in its respective text entry, parameter description text, or colloquial description text.
[0048] An independent description of a complete vehicle state means that the target segment itself clearly indicates which component is in what state.
[0049] The requirement that a complete vehicle state can only be described by combining it with adjacent text means that the meaning of the target segment is incomplete and needs to be combined with the words before and after it to form a complete state description.
[0050] The merged descriptive unit refers to a new, semantically complete text segment formed by merging the target segment to be combined with its immediate preceding and following text.
[0051] In this step, based on the source format characteristics of the text, such as the timestamp prefix of the log, the parameter table structure of the report, and the conversation markers of the spoken text, the acquired unstructured mixed text is automatically classified into three parts: text entries from the vehicle system log, parameter description text from the sensor report, and spoken description text from the maintenance technician's record. Next, the categorized texts are input into a natural language processing model. This model uses its internal sequence labeling capabilities to find fragments in text entries that contain keywords such as instructions and states followed by codes or descriptions; fragments in parameter description text that conform to the number + unit pattern; and fragments in colloquial description text that contain both component names such as engine and abnormal verbs or adjectives such as shaking or instability. Then, all the located fragments are compared with a pre-stored set of vehicle domain proper nouns, and only those fragments containing at least one word appearing in the proper noun set are retained. These retained fragments are the target fragments. Subsequently, for each target fragment, its context in the original text is analyzed. By determining whether the fragment contains a clear subject-verb-object structure or is separated by punctuation marks such as periods, it is determined whether it can independently describe the complete vehicle state or is semantically incomplete and needs to be combined with adjacent text. Then, for those target fragments that are determined to need to be combined, the preceding and following words in the original text are extracted and concatenated with the target fragment itself to form a more semantically complete merged description unit. 
Finally, the target fragments that are determined to be independently descriptive, as well as all the generated merged description units, are gathered together as the key description fragment output extracted from the unstructured mixed text for use in subsequent steps.
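The classification, location, and filtering stages described above can be approximated with simple heuristics. The sketch below swaps in regular expressions for the sequence labeling model, which is a deliberate simplification; the patterns, the unit list, and the small term set are assumptions, not values from the application.

```python
import re

# Hypothetical format cues: the text names a timestamp prefix and a
# number + unit pattern, but gives no concrete expressions.
TIMESTAMP = re.compile(r"\b\d{2}:\d{2}:\d{2}\b")
NUMBER_UNIT = re.compile(r"\d+(?:\.\d+)?\s*(?:kPa|%|rpm|V)")

# Small illustrative subset of a vehicle-specific term set (lower case).
TERMS = {"throttle", "opening", "deviation", "intake manifold",
         "pressure sensor", "kpa", "engine", "vibration"}


def classify(line):
    """Assign a line to one of the three source categories by format cues."""
    if TIMESTAMP.search(line):
        return "log"
    if NUMBER_UNIT.search(line):
        return "sensor"
    return "technician"


def locate_readings(line):
    """Return every 'number + unit' fragment found in the line."""
    return NUMBER_UNIT.findall(line)


def keep_fragment(fragment):
    """Keep a fragment only if it contains at least one term from the set."""
    text = fragment.lower()
    return any(term in text for term in TERMS)
```

For instance, `classify` treats any line with an `hh:mm:ss` timestamp as a log entry before testing for readings, so a log entry that also contains percentages is not mistaken for a sensor report.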
[0052] For example, continuing the implementation of the previous step and taking vehicle A as an example, the system first identifies three text entries. The first, from the vehicle's system log, reads "time: 10:05:23, throttle opening command: 15%, actual opening: 12%, status: deviation exceeds limit". The second, a parameter description from the sensor report, reads "intake manifold absolute pressure sensor reading: 65 kPa, below the standard range". The third, a colloquial description recorded by the mechanic, reads "the owner reports significant engine vibration while waiting at red lights; initial inspection revealed no fault codes". Next, the natural language processing model locates two segments in the first entry: "throttle opening command: 15%" and "actual opening: 12%, status: deviation exceeds limit". In the second entry it locates the segment "65 kPa, below the standard range", and in the third it locates the segment "significant engine vibration". Then a set of vehicle-specific terms, containing words such as throttle, opening, deviation, intake manifold, pressure sensor, kPa, engine, and vibration, is used for filtering. All four segments contain terms from this set and are therefore retained as target segments. Context analysis determines that "throttle opening command: 15%" is unclear on its own and needs to be combined with the text that follows it, whereas "actual opening: 12%, status: deviation exceeds limit" is descriptive enough on its own and is deemed independently descriptive. The other two segments, "65 kPa, below the standard range" and "significant engine vibration", are also deemed independently descriptive. The segment "throttle opening command: 15%" is then merged with the immediately following "actual opening: 12%, status: deviation exceeds limit" to form a new merged description unit: "throttle opening command: 15%, actual opening: 12%, status: deviation exceeds limit". Finally, the independently descriptive segments "actual opening: 12%, status: deviation exceeds limit", "65 kPa, below the standard range", and "significant engine vibration" are gathered together with the merged description unit as the extracted key description fragments.
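The filtering and merging in this example can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `Fragment` class, the `independent` flag (assumed to be set beforehand by context analysis), and the substring-matching approach are all assumptions, and the last sample fragment is invented to show a rejected case.

```python
from dataclasses import dataclass

# Term set taken from the words listed in the example; matching is a
# simple case-insensitive substring test (an assumption).
VEHICLE_TERMS = {"throttle", "opening", "deviation", "intake manifold",
                 "pressure sensor", "kpa", "engine", "vibration"}


@dataclass
class Fragment:
    text: str
    independent: bool  # assumed to be decided upstream by context analysis


def keep_target_fragments(fragments):
    """Retain fragments that contain at least one vehicle-specific term."""
    return [f for f in fragments
            if any(term in f.text.lower() for term in VEHICLE_TERMS)]


def merge_dependent(fragments):
    """Merge each non-independent fragment with the fragment right after it,
    then return independent fragments followed by the merged units."""
    merged_units = []
    for i, frag in enumerate(fragments):
        if not frag.independent and i + 1 < len(fragments):
            merged_units.append(f"{frag.text}, {fragments[i + 1].text}")
    independents = [f.text for f in fragments if f.independent]
    return independents + merged_units


located = [
    Fragment("throttle opening command: 15%", independent=False),
    Fragment("actual opening: 12%, status: deviation exceeds limit", True),
    Fragment("65 kPa, below the standard range", True),
    Fragment("significant engine vibration", True),
    Fragment("weather was sunny", True),  # invented; contains no vehicle term
]
key_fragments = merge_dependent(keep_target_fragments(located))
```

As in the example, the result holds the three independently descriptive segments plus one merged description unit.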
[0053] In this step, targeted text mining combined with domain knowledge is used to initially extract descriptive units that are directly related to vehicle status and have relatively clear semantics from multi-source heterogeneous unstructured text, providing structured information input for subsequent in-depth analysis and reasoning based on system knowledge.
[0054] Step 103: Map the key description fragments to the vehicle functional topology network, and deduce the events described by the key description fragments to discover logical inconsistencies between fragments. The vehicle functional topology network is used to reflect the connections and dependencies between various subsystems of the vehicle.
[0055] Optionally, step 103 may specifically include: Step 1031: Determine the network node corresponding to the vehicle component name mentioned in each of the key description segments from the vehicle functional topology network.
[0056] Step 1032: Convert the vehicle state described by the key description fragment into a state tag attached to the corresponding network node.
[0057] Step 1033: In the vehicle functional topology network, starting from the network node with the attached status label, the vehicle status represented by the status label is transmitted to the downstream network node that has a direct connection relationship with the network node with the attached status label, according to the predefined connection line direction in the vehicle functional topology network.
[0058] Step 1034: Based on the internal relationship of the vehicle system represented by the connection line, determine the expected form of the vehicle state when it is transmitted to the downstream network node, and record the expected form of the state as the expected state marker.
[0059] Optionally, step 1034 may specifically include the following steps: based on the type of connection line in the vehicle functional topology network connecting the network node with the attached state tag and the downstream network node, determine the conversion rule from the vehicle state represented by the state tag to the state of the downstream network node, wherein the type of connection line includes lines representing direct causal relationships, lines representing sequential signal transmission, and lines representing functional dependencies; according to the determined conversion rule, logically convert the state description content contained in the state tag on the network node with the attached state tag to obtain the converted state content corresponding to the downstream network node; combine the converted state content with the information identifying the downstream network node and the information identifying the starting network node and connection line on which the conversion is based to form the expected state tag for the downstream network node.
[0060] Step 1035: Identify the network nodes in the vehicle functional topology network that are directly associated with a key description fragment and carry a status tag.
[0061] Step 1036: Compare the status tag on each identified network node with the expected state markers transmitted to that node. If the comparison meets the preset conditions, a logical inconsistency is determined to have been found.
[0062] In this step, the vehicle functional topology network is a directed graph structure used to characterize the interaction relationships between various subsystems and components inside the vehicle. In the scenario of this application, the nodes of the network represent specific vehicle components, and the directed edges between nodes, i.e., the connecting lines, represent the signal flow, functional dependence, or causal relationship between components.
[0063] A network node is a vertex in the vehicle functional topology network that represents a specific vehicle component, such as the throttle valve, intake pressure sensor, or engine control unit.
[0064] The direction of a connection line refers to the direction in which a directed edge points, representing the direction of signal, influence, or dependency transmission. For example, an edge pointing from the throttle node to the engine node indicates that the throttle state will affect the engine state.
[0065] A direct connection refers to a direct connection between two network nodes in a vehicle functional topology network via a directed edge.
[0066] Downstream network nodes refer to nodes that receive signals or influences from another node along the direction of the connection line.
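One way to picture the definitions above is an adjacency-list directed graph. The sketch below is illustrative only: the class name `TopologyNetwork`, its methods, and the line-type strings are hypothetical, and the two edges are the ones mentioned in the vehicle A example.

```python
from collections import defaultdict


class TopologyNetwork:
    """Directed graph: nodes are vehicle components, edges are connection lines."""

    def __init__(self):
        # upstream node -> list of (downstream node, connection line type)
        self._out = defaultdict(list)

    def add_line(self, upstream, downstream, line_type):
        """Add a directed connection line pointing from upstream to downstream."""
        self._out[upstream].append((downstream, line_type))

    def downstream(self, node):
        """Nodes that have a direct connection from `node` (one directed edge)."""
        return [target for target, _ in self._out.get(node, [])]


network = TopologyNetwork()
network.add_line("throttle", "engine", "functional_dependency")
network.add_line("intake_manifold_pressure_sensor", "engine", "signal_transmission")
```

Here `network.downstream("throttle")` yields the engine node, while the engine node itself has no downstream nodes in this small network.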
[0067] A status tag is a structured data object used to represent the status of a vehicle attached to a network node. Its content is derived from the corresponding key descriptive fragments.
[0068] The expected form of the vehicle state refers to the state description that a downstream network node should theoretically exhibit, according to the principles of the vehicle system, when a network node is marked with a certain vehicle state and that state is transmitted along the connection line to the downstream network node.
[0069] An expected state label is a structured data object used to record the expected behavior of a downstream network node. Its content includes the expected state description, target node information, and the source of the expectation.
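The two structured data objects defined above might be represented as simple records. The field names below are assumptions chosen to mirror the description (expected state description, target node information, and source of the expectation); the source does not prescribe a concrete layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateTag:
    node: str             # network node the tag is attached to
    state: str            # state description derived from the fragment
    source_fragment: str  # key description fragment the tag was derived from


@dataclass(frozen=True)
class ExpectedStateTag:
    target_node: str      # downstream node expected to exhibit the state
    expected_state: str   # expected state description
    source_node: str      # upstream node the expectation started from
    line_type: str        # connection line the conversion was based on


throttle_tag = StateTag(
    node="throttle",
    state="opening deviation exceeds limit",
    source_fragment="throttle opening command: 15%, actual opening: 12%, "
                    "status: deviation exceeds limit",
)
engine_expectation = ExpectedStateTag(
    target_node="engine",
    expected_state="potential instability",
    source_node=throttle_tag.node,
    line_type="functional_dependency",
)
```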
[0070] Logical inconsistency refers to a situation in a vehicle functional topology network where the state label on the same network node, which is directly converted from a key description fragment, contradicts the expected state label transmitted to that node through network deduction in terms of state description content.
[0071] The type of connection line refers to the classification of different interaction relationships between vehicle components. A line that represents a direct causal relationship means that the state of one node is the cause of the state of another node. A line that represents sequential signal transmission means that the signal output of one node is the input of another node. A line that represents functional dependence means that the normal operation of one node requires the other node to be in a specific state.
[0072] The transformation rule is a set of predefined logical mapping relationships used to describe how, when the state tag of an upstream node is A, the state of a downstream node should be transformed to content B after passing through a specific type of connection.
[0073] The transformed state content refers to the state description text for downstream nodes obtained after applying transformation rules to the state description content of upstream nodes.
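A transformation rule set of this kind can be sketched as a lookup table keyed by connection line type and upstream state. The table layout and the function name `transform` are assumptions; the two entries reuse the states and expectations given in the vehicle A example.

```python
# Hypothetical rule table:
# (connection line type, upstream state A) -> transformed state content B
TRANSFORMATION_RULES = {
    ("functional_dependency", "opening deviation exceeds limit"):
        "potential instability",
    ("signal_transmission", "reading below standard range"):
        "insufficient power and potential vibration",
}


def transform(line_type, upstream_state):
    """Return the transformed state content, or None when no rule applies."""
    return TRANSFORMATION_RULES.get((line_type, upstream_state))
```

Returning `None` when no rule matches is a design choice of this sketch; it lets the caller skip connection lines for which no expectation can be derived.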
[0074] The preset conditions refer to the situation in which, for the same network node, the state tag derived directly from a key description fragment contradicts an expected state tag obtained through transmission.
[0075] In this step, all the obtained key description fragments are first traversed, and the names of vehicle components mentioned in each fragment are extracted. Then, network nodes with completely identical names are searched in the pre-built vehicle function topology network to establish a mapping relationship between key description fragments and network nodes. Next, for each key description fragment that has been mapped to a network node, its text content is parsed, and verb phrases or adjective phrases describing the state are extracted. These are then structured according to the format of component: state to form the state label on the network node. Then, in the vehicle function topology network, starting from each network node that has been marked with a state tag, all directed connection lines originating from that node and pointing to other nodes are identified. Along the direction of these connection lines, the state represented by the state tag on the current node is regarded as an event to be transmitted and prepared to be transmitted to the direct downstream network nodes pointed to by these lines. Subsequently, for each connection line from the marked node to its downstream node, according to the type of the connection line predefined in the topology network, the corresponding transformation rule is called to logically deduce the state description content in the state tag of the upstream node, so as to obtain the state description that the downstream node should theoretically exhibit, i.e., the transformed state content. Then, this content, the identifier of the downstream node, and the identifiers of the upstream node and connection line on which this transformation is based are packaged to form a complete expected state tag and recorded. 
After all states have been transmitted and all expected state tags have been generated, the entire vehicle functional topology network is scanned again to find every network node that is directly associated with a key description fragment and carries a state tag. Finally, for each such node, the actual state tag generated directly from the key description fragment is compared one by one with the state descriptions in all expected state tags transmitted to that node. When the state in an expected state tag directly and logically conflicts with the state in the node's existing state tag, for example when one describes normal operation and the other describes functional failure, a logical inconsistency is determined to have been found, and the nodes and contradictory state contents involved are recorded.
[0076] For example, continuing the implementation of the previous step, the vehicle functional topology network of vehicle A includes nodes such as the throttle, the intake manifold absolute pressure sensor, and the engine. A functional dependency path runs from the throttle to the engine, and a sequential signal transmission path runs from the intake manifold absolute pressure sensor to the engine. Based on the extracted key description fragments, "throttle opening command: 15%, actual opening: 12%, status: deviation exceeds limit" is first mapped to the throttle node and converted into the state tag "throttle: opening deviation exceeds limit". "65 kPa, below the standard range" is mapped to the intake manifold absolute pressure sensor node and converted into the state tag "intake manifold absolute pressure sensor: reading below the standard range". "Significant engine vibration" is mapped to the engine node and converted into the state tag "engine: significant vibration". Next, starting from the throttle node, its state is transmitted to the engine node along the functional dependency path; based on the corresponding rule, the expected state "potential instability" is deduced and recorded as an expected state tag. Likewise, starting from the intake manifold absolute pressure sensor node, its state is transmitted to the engine node along the sequential signal transmission path; the expected state "insufficient power and potential vibration" is deduced and also recorded as an expected state tag. The engine node is then found to be directly tagged with "significant vibration". Finally, this direct state tag is compared with the two expected state tags transmitted to the engine node. "Significant vibration" and "potential vibration" are semantically consistent, so no direct contradiction is found. The process nevertheless illustrates how states are compared through deduction: if the direct state tag were instead "stable operation", it would directly contradict the expectation of potential vibration, and a logical inconsistency would be determined to have been found.
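The deduction and comparison walked through for vehicle A can be sketched end to end as below. This is a simplified illustration under stated assumptions: the rule table, the `CONTRADICTIONS` set standing in for the preset conditions, and all function names are hypothetical, and contradiction is decided by a fixed lookup rather than by semantic comparison.

```python
# Directed connection lines from the vehicle A example:
# (upstream node, downstream node, connection line type)
LINES = [
    ("throttle", "engine", "functional_dependency"),
    ("intake_manifold_pressure_sensor", "engine", "signal_transmission"),
]

# Hypothetical conversion rules: (line type, upstream state) -> expected state
RULES = {
    ("functional_dependency", "opening deviation exceeds limit"):
        "potential instability",
    ("signal_transmission", "reading below standard range"):
        "insufficient power and potential vibration",
}

# Hypothetical stand-in for the preset conditions: state pairs that contradict
CONTRADICTIONS = {
    frozenset({"stable operation", "potential instability"}),
    frozenset({"stable operation", "insufficient power and potential vibration"}),
}


def propagate(state_tags):
    """Build expected state tags for the direct downstream node of each tagged node."""
    expected = []
    for upstream, downstream, line_type in LINES:
        converted = RULES.get((line_type, state_tags.get(upstream)))
        if converted is not None:
            expected.append({"node": downstream, "state": converted,
                             "source": upstream, "line_type": line_type})
    return expected


def find_inconsistencies(state_tags):
    """Compare each directly tagged node with the expectations transmitted to it."""
    return [
        (exp["node"], state_tags[exp["node"]], exp["state"], exp["source"])
        for exp in propagate(state_tags)
        if exp["node"] in state_tags
        and frozenset({state_tags[exp["node"]], exp["state"]}) in CONTRADICTIONS
    ]


observed = {
    "throttle": "opening deviation exceeds limit",
    "intake_manifold_pressure_sensor": "reading below standard range",
    "engine": "significant vibration",
}
consistent_case = find_inconsistencies(observed)
conflict_case = find_inconsistencies({**observed, "engine": "stable operation"})
```

With the engine tagged "significant vibration" no inconsistency is reported; replacing that tag with "stable operation" makes both transmitted expectations conflict with it.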
[0077] This step, by placing the extracted text results into a structured network of vehicle system knowledge for deduction and verification, can proactively discover potential inherent contradictions between descriptive information from different data sources, thereby identifying unreliable or unverifiable parts in the initial extraction results and improving the input quality of subsequent reasoning processes.
[0078] Step 104: Based on the logical inconsistency, extract relevant auxiliary descriptive text from the original context of the unstructured mixed text.
[0079] Optionally, step 104 may specifically include: Step 1041: Based on the logical inconsistency, determine the target network node in the vehicle functional topology network involved in the logical inconsistency.
[0080] Step 1042: Backtrack to the unstructured mixed text, find and locate the original text paragraphs that originated from the vehicle components corresponding to the target network nodes when extracting the key description fragments.
[0081] Step 1043: Based on the original text paragraph, extend to the adjacent text range before and after the original text paragraph to obtain an extended context text block containing the original text paragraph.
[0082] Step 1044: From the extended context text block, select the text sentences that contain vehicle component names, status descriptors, or action descriptors related to the target network node.
[0083] Step 1045: Combine all selected text sentences to form the auxiliary descriptive text related to the logical inconsistency.
[0084] In this step, logical inconsistency refers to the discrepancies in state descriptions found within the vehicle functional topology network.
[0085] The target network node refers to one or more component nodes in the vehicle functional topology network that specifically involve the logical inconsistency, such as the node whose status label conflicts with the expected status label.
[0086] The original text paragraph refers to the portion of the original text from which a key description fragment was extracted and which relates to the vehicle component corresponding to the target network node.
[0087] An extended contextual text block refers to a larger text region with a more complete semantic background, obtained by extending a certain number of characters or sentences before and after the original text paragraph.
[0088] A text sentence refers to a language unit with an independent subject-predicate structure that is separated from an extended context text block by punctuation marks such as periods, question marks, and exclamation marks.
[0089] Auxiliary descriptive text refers to one or more passages of text that are combined to provide additional information for the subsequent handling of logical inconsistencies.
[0090] In this step, the output logical inconsistency records are first parsed to extract all nodes in the vehicle function topology network directly involved in the contradictions. These nodes are then identified as the target network nodes that need to be focused on in this step. Next, based on the established correspondence between key description fragments and the original text, the specific location and content of the unstructured mixed text from which the key description fragments mapped to the target network nodes originated are searched in reverse to locate the corresponding original text paragraphs. Then, using each located original text paragraph as an anchor point, the original unstructured mixed text is scanned forward and backward, and text containing a predetermined number of characters or sentences is extracted. The anchor paragraphs are merged with these extended texts before and after to form a larger extended contextual text block containing a broader context. This is done to capture relevant descriptions that may have been overlooked in the initial extraction. Subsequently, for each extended context text block, sentence segmentation technology is used to split it into independent text sentences. Then, using a filtering word list containing nouns related to vehicle components, state adjectives, and action verbs, keyword matching is performed on each text sentence, retaining only those text sentences containing at least one word related to the target network node. Finally, all the filtered text sentences from different extended context text blocks are concatenated according to their order of appearance in the original text to form a coherent auxiliary descriptive text related to the logical inconsistency. This text will be used as supplementary information input into subsequent integration reasoning steps.
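The backtracking, context extension, and sentence selection described above can be sketched as follows. The sketch makes several simplifying assumptions: the window is measured in sentences rather than characters, a single anchor is used, and keyword matching is a lowercase substring test. The function names are hypothetical, and the first and last sentences of the sample record are invented filler.

```python
import re


def split_sentences(text):
    """Split text into sentences after periods, question marks, and exclamation marks."""
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]


def auxiliary_text(sentences, anchor_index, window, keywords):
    """Extend `window` sentences before and after the anchor sentence, keep the
    sentences containing at least one keyword, and join them in original order."""
    start = max(0, anchor_index - window)
    block = sentences[start:anchor_index + window + 1]
    kept = [s for s in block if any(k in s.lower() for k in keywords)]
    return " ".join(kept)


record = (
    "The car was washed last week. "
    "The owner claims that the engine runs smoothly after recent high-speed driving. "
    "The intake manifold absolute pressure sensor reading remains low when the "
    "engine is warm and idling. "
    "The tires were rotated."
)
sentences = split_sentences(record)
aux_text = auxiliary_text(sentences, anchor_index=1, window=1,
                          keywords={"engine", "pressure sensor"})
```

The anchor is the sentence from which the key description fragment was extracted; the unrelated sentence inside the window is dropped by the keyword filter, and the one outside the window is never considered.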
[0091] For example, continuing the implementation of the previous step, suppose that in the scenario of vehicle A a logical inconsistency is found: the engine node is directly marked as running smoothly, but the expected state derived from the intake manifold absolute pressure sensor node is insufficient power and possible vibration, which is contradictory. First, based on this inconsistency, the target network nodes are identified as the engine and the intake manifold absolute pressure sensor. Next, the system backtracks to the initial unstructured mixed text and locates the original text paragraphs from which the key description fragments for these nodes were extracted, namely the owner's verbal record stating that the engine runs smoothly after recent high-speed driving, and the original sentence in the sensor report. Then, starting from these two original paragraphs, the text range before and after each is expanded to obtain extended context text blocks with more complete semantics, for example the entire repair record containing the owner's description, and the other parameter descriptions near the original sentence in the sensor report. From these extended text blocks, all complete sentences containing component names, status words, or action words related to the target nodes are selected, such as "The owner claims that the engine runs smoothly after recent high-speed driving" and "The intake manifold absolute pressure sensor reading remains low when the engine is warm and idling". Finally, the selected sentences are combined in their original order to form the auxiliary descriptive text used for subsequent analysis.
[0092] When contradictions are found in the initially extracted and deduced information, this step automatically traces back to the original, complete data source, collects a wider range of contextual information around the point of contradiction, and supplies factual evidence that supports a more accurate and reasonable final judgment.
[0093] Step 105: Combine the key description fragments and the auxiliary descriptive text, and perform integrated reasoning under the rule constraints of the vehicle functional topology network to generate vehicle state metadata.
[0094] Optionally, step 105 may specifically include: Step 1051: Input the auxiliary description text into the natural language processing model to extract the target auxiliary description fragment describing the vehicle state.
[0095] Step 1052: Map the target auxiliary description fragment to the vehicle function topology network to determine the network node associated with the target auxiliary description fragment, and convert the vehicle state described by the target auxiliary description fragment into an auxiliary state label attached to the corresponding network node.
[0096] Optionally, step 1052 may specifically include the following steps: identifying the vehicle component names mentioned in the target auxiliary description fragment and the text content describing the component status; in the vehicle functional topology network, finding network nodes that match the vehicle component names as network nodes associated with the target auxiliary description fragment; analyzing the text content describing the component status to determine the vehicle status type and status degree description expressed therein; combining the determined vehicle status type and status degree description into status description entries according to preset formatting rules; binding the status description entries with the information identifying the network nodes associated with the status description entries to form auxiliary status markers.
[0097] Step 1053: In the vehicle functional topology network, for each target network node identified due to logical inconsistency, the state marker directly marked by the key description fragment on the target network node is compared with all auxiliary state markers mapped to the target network node.
[0098] Step 1054: Based on the comparison results, if the content of any auxiliary state marker reconciles the logical inconsistency, then the auxiliary state marker is used to update the original state marker on the target network node. If none of the auxiliary state markers can reconcile the logical inconsistency, then the record of the logical inconsistency and all related state markers are retained.
[0099] Step 1055: After updating and recording all target network nodes, in the vehicle functional topology network, starting from each network node marked with a status marker, the vehicle status represented by the updated status marker is transmitted to all downstream network nodes again and a status consistency check is performed according to the direction of the connection line.
[0100] Step 1056: Integrate all node status markers in the vehicle functional topology network that have been updated and checked to maintain consistency, as well as the recorded unresolved logical inconsistencies, and organize them into structured data according to a preset metadata format as vehicle status metadata. The vehicle status metadata includes vehicle components, status descriptions, relationships, and consistency conclusions.
[0101] In this step, the target auxiliary description fragment refers to the relatively clear text unit that describes the state of vehicle components, extracted from the auxiliary description text through further processing.
[0102] Auxiliary state labels are a type of structured data object similar to state labels, but specifically refer to state representations derived from target auxiliary description fragments and attached to network nodes.
[0103] A state description entry refers to a data entry obtained by analyzing a target auxiliary description fragment and reorganizing its state information into a unified format, such as state type: degree description.
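Forming a state description entry and binding it to a node might look like the sketch below. The vocabularies, the `unspecified` fallback, and the function name are assumptions made for illustration; the description only fixes the `type:degree` shape of the entry and its binding to the identified network node.

```python
# Hypothetical vocabularies; the example words come from the description.
STATE_TYPES = ("jitter", "low reading", "smooth operation")
DEGREES = ("obvious", "persistently", "significant")


def build_auxiliary_marker(node, state_text):
    """Bind a `type:degree` state description entry to a network node.
    Returns None when no known state type is mentioned."""
    lowered = state_text.lower()
    state_type = next((t for t in STATE_TYPES if t in lowered), None)
    if state_type is None:
        return None
    degree = next((d for d in DEGREES if d in lowered), "unspecified")
    return {"node": node, "entry": f"{state_type}:{degree}"}


engine_marker = build_auxiliary_marker("engine", "Obvious jitter at idle")
sensor_marker = build_auxiliary_marker(
    "intake_manifold_pressure_sensor", "persistently low reading at warm idle")
```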
[0104] Reconciling logical inconsistencies refers to the ability of auxiliary state labels to explain or eliminate logical contradictions between key descriptive fragment state labels and expected state labels.
[0105] The transmission and state consistency check refers to resimulating the propagation of state in the vehicle functional topology network after updating the state tags of some nodes, and checking whether there are still new, unresolved logical contradictions among all state tags in the network.
[0106] Vehicle status metadata is the final output, structured collection of information. It records, in a standardized format, the vehicle component status confirmed after integration and reasoning, the relationships between components, and the final conclusion on the consistency of the entire information.
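As an illustration of such an output, the sketch below assembles a record with the four kinds of content named in step 1056 (components, status descriptions, relationships, and a consistency conclusion). The key names and the function `build_metadata` are assumptions; the source does not specify the preset metadata format itself.

```python
import json


def build_metadata(node_states, lines, unresolved):
    """Assemble vehicle state metadata from final node states, connection
    lines, and any recorded unresolved logical inconsistencies."""
    conclusion = "consistent" if not unresolved else "partial conflicts to be verified"
    return {
        "components": sorted(node_states),
        "status_descriptions": node_states,
        "relationships": [{"from": u, "to": d, "type": t} for u, d, t in lines],
        "consistency_conclusion": conclusion,
        "unresolved_inconsistencies": unresolved,
    }


metadata = build_metadata(
    {"engine": "runs smoothly after high-speed driving",
     "intake_manifold_pressure_sensor": "reading persistently low at warm idle"},
    [("intake_manifold_pressure_sensor", "engine", "signal_transmission")],
    unresolved=[],
)
print(json.dumps(metadata, indent=2))
```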
[0107] In this step, the generated auxiliary descriptive text is first input into the same natural language processing model. Using rules similar to, but possibly more lenient than, those for extracting key description fragments, the model locates and extracts all phrases that mention vehicle components and their states. These phrases are the target auxiliary description fragments. Next, for each target auxiliary description fragment, the name of the vehicle component it explicitly mentions and the text describing the specific state of that component are identified. Then the vehicle functional topology network is searched for network nodes whose names are completely consistent with that component name, and these are taken as the nodes associated with the fragment. Subsequently, the state description text is analyzed to separate the core state type it expresses, such as jitter or low reading, from the modifier or degree description of that state, such as obvious or persistently low. The state type and degree description are then combined into a state description entry in the preset format type:degree, and this entry is bound to the identification information of the associated network node to generate an auxiliary state tag. Then, for all target network nodes identified due to logical inconsistencies, the state markers generated directly from key description fragments on these nodes are found in the vehicle functional topology network. At the same time, all auxiliary state markers mapped to the same node are identified, and a detailed semantic comparison is performed between the content of the direct state marker and the content of each auxiliary state marker. Subsequently, a decision is made based on the comparison results. 
If the content of any auxiliary state marker can reasonably explain or bridge the contradiction between the direct state marker and the previously expected state marker, such as if the auxiliary state marker provides key conditional information that makes both sides of the contradiction valid under specific premises, then this auxiliary state marker is used to update, replace, or supplement the original state marker directly generated from the key description fragment on the target network node. If the content of all auxiliary state markers cannot reconcile this contradiction, then the original record of this logical inconsistency is retained, and all related state markers are saved in association. After completing the state updates or conflict recordings for all target network nodes, in the entire vehicle functional topology network, starting from each network node currently marked with a state tag, whether updated or original, these latest states are again transmitted to all downstream network nodes according to the direction of the network connection lines. This process is repeated to perform a new round of state deduction and consistency checks, ensuring that all states in the network remain logically consistent after the update, and promptly identifying and recording any new inconsistencies that may be caused by the update. Finally, after multiple rounds of transmission and checks, all logically consistent nodes in the vehicle functional topology network and their final state tags are integrated, along with any recorded unresolved logical inconsistencies. Following a preset metadata format containing fixed fields, the component names, final state descriptions, inter-component association paths, and overall consistency conclusions (e.g., consistent, partial conflicts to be verified) are organized to generate the final structured data output, namely, the vehicle state metadata.
[0108] For example, continuing the implementation of the previous step, consider the logical inconsistency found for vehicle A, in which a key description fragment indicates smooth engine operation while the deduced expectation is potential engine vibration, and for which auxiliary descriptive text has been obtained. First, this text is input into the model, which extracts two target auxiliary description fragments: "engine runs smoothly after high-speed driving" and "sensor reading remains low during warm idling". Then these two fragments are mapped to the engine node and the intake manifold absolute pressure sensor node in the vehicle functional topology network, respectively, and the states they describe are converted into structured auxiliary state labels. Next, the original "smooth operation" state label on the engine node is compared with the new auxiliary state label mapped to that node. The comparison shows that the auxiliary label not only confirms the smooth operating state but, more importantly, supplies the operating condition under which it holds, namely after high-speed driving, whereas the contradictory expected state of potential vibration was derived from sensor data recorded during warm idling. The auxiliary information thus reveals that the contradiction stems from descriptions of different operating conditions, which reconciles the logical inconsistency. Consequently, the auxiliary state label, supplemented with the operating condition, is used to update the state description of the engine node. A consistency deduction and check is then performed again on the updated network state. Finally, all confirmed state information in the network is integrated to generate structured vehicle state metadata, which records the state of each component together with its operating condition and gives a conclusion of logical consistency on the premise that operating conditions are distinguished.
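The reconciliation decision in this example can be sketched as below. The criterion used here is an assumption introduced for illustration: an auxiliary marker is treated as reconciling the inconsistency when it confirms the direct state and supplies an operating condition different from the one under which the expected state was derived. The dictionary layout and the function name `reconcile` are likewise hypothetical.

```python
def reconcile(direct, expected, auxiliary_markers):
    """Return the updated state marker if an auxiliary marker reconciles the
    contradiction, or None if the inconsistency record must be retained."""
    for aux in auxiliary_markers:
        confirms_state = aux["state"] == direct["state"]
        condition = aux.get("condition")
        # Assumed criterion: same state, but under a different operating condition.
        if confirms_state and condition is not None \
                and condition != expected.get("condition"):
            return {**direct, "condition": condition}
    return None


direct = {"node": "engine", "state": "smooth operation"}
expected = {"node": "engine", "state": "potential vibration",
            "condition": "warm idling"}

updated = reconcile(direct, expected, [
    {"node": "engine", "state": "smooth operation",
     "condition": "after high-speed driving"},
])
retained = reconcile(direct, expected, [
    {"node": "engine", "state": "smooth operation"},  # no condition supplied
])
```

When no auxiliary marker supplies a distinguishing condition, the function returns `None`, corresponding to the branch in which the logical inconsistency and all related state markers are retained.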
[0109] This step, by introducing supplementary contextual information and making decisions and global verification within the framework of the system's knowledge network, can effectively reconcile or clearly record information contradictions in the initial extraction, ultimately outputting a logically consistent, information-complete, and clearly labeled structured vehicle condition report, thus improving the accuracy and reliability of extracting vehicle condition information from unstructured text.
[0110] Figure 2 is a schematic structural diagram of an unstructured vehicle condition information extraction system based on natural language processing provided by this application. As shown in Figure 2, the system includes: a module 21, used to acquire unstructured mixed text of the target vehicle; a first extraction module 22, used to extract key description fragments describing the vehicle's state from the unstructured mixed text using a pre-trained natural language processing model; a deduction module 23, used to map the key description fragments to the vehicle functional topology network and to deduce the events described by the key description fragments in order to discover logical inconsistencies between fragments, wherein the vehicle functional topology network reflects the connections and dependencies between the subsystems of the vehicle; a second extraction module 24, used to extract relevant auxiliary descriptive text from the original context of the unstructured mixed text based on the logical inconsistency; and an integration module 25, used to combine the key description fragments and the auxiliary descriptive text and to perform integrated reasoning under the rule constraints of the vehicle functional topology network to generate vehicle state metadata.
[0111] The unstructured vehicle condition information extraction system based on natural language processing shown in Figure 2 can perform the unstructured vehicle condition information extraction method based on natural language processing described in the embodiment shown in Figure 1. Its implementation principle and technical effects will not be repeated here. The specific manner in which each module and unit of the system in the above embodiments performs its operations has been described in detail in the embodiments of the method and will not be elaborated upon here.
[0112] In one possible design, the unstructured vehicle condition information extraction system based on natural language processing of the embodiment shown in Figure 2 can be implemented as a computing device. As shown in Figure 3, the computing device may include a storage component 31 and a processing component 32. The storage component 31 stores one or more computer instructions, wherein the one or more computer instructions are invoked and executed by the processing component 32.
[0113] The processing component 32 is used to implement the unstructured vehicle condition information extraction method based on natural language processing described in the embodiment shown in Figure 1.
[0114] The processing component 32 may include one or more processors to execute computer instructions to complete all or part of the steps in the above-described method. Alternatively, the processing component may be implemented as one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the above-described method.
[0115] Storage component 31 is configured to store various types of data to support operations at the terminal. The storage component can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.
[0116] Of course, computing devices may also include other components, such as input / output interfaces, display components, communication components, etc.
[0117] Input / output interfaces provide interfaces between processing components and peripheral interface modules, which can be output devices, input devices, etc.
[0118] The communication components are configured to facilitate wired or wireless communication between computing devices and other devices.
[0119] The computing device can be a physical device or an elastic computing host provided by a cloud computing platform. In this case, the computing device can refer to a cloud server, and the aforementioned processing components, storage components, etc., can be basic server resources rented or purchased from the cloud computing platform.
[0120] This application also provides a computer storage medium storing a computer program which, when executed by a computer, implements the method for extracting unstructured vehicle condition information based on natural language processing described in the embodiment shown in Figure 1.
[0121] Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the systems, devices, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0122] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.
[0123] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.
[0124] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application.
Claims
1. A method for extracting unstructured vehicle condition information based on natural language processing, characterized in that it comprises: obtaining unstructured mixed text of the target vehicle; extracting, using a pre-trained natural language processing model, key description fragments describing the vehicle's state from the unstructured mixed text; mapping the key description fragments onto a vehicle functional topology network and deducing the events described by the key description fragments to discover logical inconsistencies between fragments, wherein the vehicle functional topology network reflects the connections and dependencies between the subsystems of the vehicle; extracting, based on the logical inconsistency, relevant auxiliary description text from the original context of the unstructured mixed text; and combining the key description fragments and the auxiliary description text and performing integrated reasoning under the rule constraints of the vehicle functional topology network to generate vehicle state metadata.
2. The method according to claim 1, characterized in that extracting, using a pre-trained natural language processing model, key description fragments describing the vehicle's state from the unstructured mixed text comprises: identifying, from the unstructured mixed text, text entries from the vehicle system log, parameter description text from sensor reports, and colloquial description text from the maintenance technician's records; inputting the text entries, the parameter description text, and the colloquial description text into the natural language processing model, and locating fragments containing vehicle control instructions in the text entries, fragments containing a combination of a numerical reading and a measurement unit in the parameter description text, and fragments containing the name of a target vehicle component and a description of an abnormal phenomenon in the colloquial description text; filtering all fragments based on a pre-stored set of vehicle-related specialized terms to retain target fragments whose vocabulary matches the set; determining, based on the context of the original text in which each target fragment is located, whether the target fragment independently describes a complete vehicle state or needs to be combined with adjacent text to describe a complete vehicle state; merging each target fragment determined to need combination with the adjacent text in the original text to form a merged description unit; and designating all target fragments determined to be independent descriptions, together with the merged description units, as the key description fragments.
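The locating and filtering steps of claim 2 can be illustrated with a minimal Python sketch. It is not the claimed natural language processing model: a regular expression stands in for locating "numerical reading plus measurement unit" fragments, and a small in-memory set stands in for the pre-stored term set. The names VEHICLE_TERMS, READING_RE, find_readings, and filter_by_terms, as well as the unit list, are assumptions made only for illustration.

```python
import re

# Hypothetical term set; the claim only states that such a set is pre-stored.
VEHICLE_TERMS = {"brake", "coolant", "battery", "engine"}

# Assumed unit list, for illustration only.
READING_RE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:V|kPa|rpm|°C|psi)(?![A-Za-z])")


def find_readings(text):
    """Return substrings that pair a numerical reading with a measurement unit."""
    return [m.group(0) for m in READING_RE.finditer(text)]


def filter_by_terms(fragments, terms=VEHICLE_TERMS):
    """Keep only fragments containing at least one term from the term set."""
    kept = []
    for frag in fragments:
        words = set(re.findall(r"[a-z]+", frag.lower()))
        if words & terms:
            kept.append(frag)
    return kept
```

In a real system the regular expression would be replaced by the model's span predictions; the term filter is applied afterwards in the same way.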
3. The method according to claim 1, characterized in that mapping the key description fragments onto the vehicle functional topology network and deducing the events described by the key description fragments to discover logical inconsistencies between fragments comprises: determining, from the vehicle functional topology network, the network nodes corresponding to the vehicle component names mentioned in each key description fragment; converting the vehicle state described by each key description fragment into a state tag attached to the corresponding network node; in the vehicle functional topology network, starting from each network node with an attached state tag, transmitting the vehicle state represented by the state tag, along the predefined connection line direction, to the downstream network nodes directly connected to that network node; determining, based on the internal relationships of the vehicle system represented by the connection lines, the expected manifestation of the vehicle state when it is transmitted to a downstream network node, and recording the expected manifestation as an expected state tag; inspecting the network nodes in the vehicle functional topology network that are directly associated with key description fragments and carry state tags; and comparing the state tags on the inspected network nodes with the expected state tags, and determining that a logical inconsistency has been found if the comparison meets a preset condition.
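The propagation and comparison of claim 3 can be sketched as follows, under stated assumptions. The network is a plain adjacency mapping, states are short strings, and the "preset condition" is taken to be a simple mismatch between the tag observed on a node and the tag expected from its upstream neighbour. The claim does not fix any of these choices; TopologyNetwork, same_state, expected_tags, and find_inconsistencies are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class TopologyNetwork:
    # Directed connection lines: upstream node -> direct downstream nodes.
    edges: dict
    # State tags attached directly from key description fragments.
    observed: dict = field(default_factory=dict)


def same_state(src, dst, state):
    """Assumed conversion rule: the upstream state carries over unchanged."""
    return state


def expected_tags(net, rule):
    """Transmit each observed state one hop along the connection direction."""
    expected = {}
    for src, state in net.observed.items():
        for dst in net.edges.get(src, []):
            expected.setdefault(dst, []).append((src, rule(src, dst, state)))
    return expected


def find_inconsistencies(net, rule):
    """Compare observed tags with expected tags on nodes that carry both."""
    issues = []
    for node, items in expected_tags(net, rule).items():
        if node not in net.observed:
            continue  # only nodes tagged directly by a fragment are inspected
        for src, exp in items:
            if exp != net.observed[node]:  # assumed preset condition: mismatch
                issues.append({"node": node, "source": src,
                               "expected": exp, "observed": net.observed[node]})
    return issues
```

For example, with a line from "alternator" to "battery", an observed "fault" on the alternator and an observed "normal" on the battery yields one recorded inconsistency on the battery node.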
4. The method according to claim 1, characterized in that extracting, based on the logical inconsistency, relevant auxiliary description text from the original context of the unstructured mixed text comprises: identifying, based on the logical inconsistency, the target network nodes in the vehicle functional topology network that are involved in the logical inconsistency; tracing back to the unstructured mixed text and locating the original text paragraphs from which the key description fragments for the vehicle components corresponding to the target network nodes were extracted; expanding, starting from each original text paragraph, the scope to the adjacent text paragraphs before and after it to obtain an extended context text block containing the original text paragraph; selecting, from the extended context text block, the text sentences containing vehicle component names, status descriptors, and action descriptors related to the target network node; and combining all selected text sentences to form the auxiliary description text related to the logical inconsistency.
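The context expansion and sentence selection of claim 4 can be sketched in Python as below. This is a simplified illustration: the window of one paragraph on each side, the naive sentence splitter, and the single keyword set (standing in for component names, status descriptors, and action descriptors) are all assumptions, as are the function names expand_context and auxiliary_text.

```python
import re


def expand_context(paragraphs, index, window=1):
    """Return the paragraph at `index` plus `window` neighbours on each side."""
    start = max(0, index - window)
    end = min(len(paragraphs), index + window + 1)
    return paragraphs[start:end]


def auxiliary_text(paragraphs, index, keywords, window=1):
    """Join the sentences of the expanded block that mention any keyword."""
    selected = []
    for para in expand_context(paragraphs, index, window):
        # Naive split on sentence-ending punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", para.strip()):
            lowered = sentence.lower()
            if any(k.lower() in lowered for k in keywords):
                selected.append(sentence)
    return " ".join(selected)
```

The window size controls how far the search reaches beyond the original paragraph; the claim itself only requires the adjacent paragraphs before and after.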
5. The method according to claim 1, characterized in that combining the key description fragments and the auxiliary description text and performing integrated reasoning under the rule constraints of the vehicle functional topology network to generate vehicle state metadata comprises: inputting the auxiliary description text into the natural language processing model to extract target auxiliary description fragments describing the vehicle state; mapping the target auxiliary description fragments onto the vehicle functional topology network to determine the network nodes associated with the target auxiliary description fragments, and converting the vehicle state described by each target auxiliary description fragment into an auxiliary state tag attached to the corresponding network node; in the vehicle functional topology network, for each target network node identified due to the logical inconsistency, comparing the state tag attached directly to the target network node by a key description fragment with all auxiliary state tags mapped to the target network node; based on the comparison results, if the content of any auxiliary state tag reconciles the logical inconsistency, using that auxiliary state tag to update the original state tag on the target network node, and if none of the auxiliary state tags can reconcile the logical inconsistency, retaining the record of the logical inconsistency and all related state tags; after all target network nodes have been updated or recorded, in the vehicle functional topology network, starting from each network node carrying a state tag, transmitting the vehicle state represented by the updated state tag again to all downstream network nodes along the connection line direction and performing a state consistency check; and integrating all updated and checked node state tags in the vehicle functional topology network that remain consistent, together with the recorded unresolved logical inconsistencies, and organizing them into structured data according to a preset metadata format, the structured data serving as the vehicle state metadata, which includes vehicle components, status descriptions, relationships, and consistency conclusions.
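The reconcile-or-record decision and the final assembly of metadata in claim 5 can be sketched as follows. The claim does not define when an auxiliary state tag "reconciles" an inconsistency; the sketch assumes it does so when it equals the state expected from the upstream node. The record layout is also an assumption and, for brevity, omits the relationship field that the claimed metadata includes. The function names reconcile and build_metadata are hypothetical.

```python
def reconcile(observed, expected, auxiliary):
    """Resolve one node's inconsistency using its auxiliary state tags.

    Assumption: an auxiliary tag reconciles the inconsistency when it equals
    the expected state transmitted from the upstream node.
    """
    for aux in auxiliary:
        if aux == expected:
            # Update the original state tag with the reconciling auxiliary tag.
            return {"state": aux, "consistent": True, "unresolved": None}
    # No auxiliary tag helps: keep the record and all related tags.
    return {
        "state": observed,
        "consistent": False,
        "unresolved": {"observed": observed, "expected": expected,
                       "auxiliary": list(auxiliary)},
    }


def build_metadata(results):
    """Turn {node: reconcile-output} into a list of structured records."""
    records = []
    for node, r in sorted(results.items()):
        records.append({"component": node, "state": r["state"],
                        "consistent": r["consistent"],
                        "unresolved": r["unresolved"]})
    return records
```

Keeping unresolved inconsistencies in the output, rather than silently choosing one side, mirrors the claim's requirement that contradictions be either reconciled or clearly recorded.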
6. The method according to claim 3, characterized in that determining, based on the internal relationships of the vehicle system represented by the connection lines, the expected manifestation of the vehicle state when it is transmitted to a downstream network node, and recording the expected manifestation as an expected state tag comprises: determining, based on the type of the connection line in the vehicle functional topology network that connects the network node with the attached state tag to the downstream network node, the conversion rule from the vehicle state represented by the state tag to the state of the downstream network node, wherein the types of connection line include lines representing direct causal relationships, lines representing sequential signal transmission, and lines representing functional dependencies; logically converting, according to the determined conversion rule, the state description content contained in the state tag on the network node with the attached state tag to obtain the converted state content corresponding to the downstream network node; and combining the converted state content with information identifying the downstream network node and information identifying the starting network node and the connection line on which the conversion is based to form the expected state tag for the downstream network node.
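The selection of a conversion rule by connection line type in claim 6 can be sketched as a lookup table. The three keys follow the line types named in the claim; the conversion functions themselves are invented placeholders, since the claim does not specify how a state is rewritten for each type. ExpectedTag, RULES, and expected_tag are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedTag:
    node: str         # downstream network node the tag is attached to
    state: str        # converted state content
    source_node: str  # starting network node of the conversion
    line_type: str    # connection line type on which the conversion is based


# Placeholder conversion rules, one per line type named in the claim.
RULES = {
    "causal": lambda s: s,  # assumed: the state carries over unchanged
    "signal": lambda s: f"signal affected by upstream {s}",
    "dependency": lambda s: f"degraded due to upstream {s}",
}


def expected_tag(source, target, line_type, state):
    """Convert an upstream state into an expected tag for the downstream node."""
    convert = RULES[line_type]
    return ExpectedTag(node=target, state=convert(state),
                       source_node=source, line_type=line_type)
```

Storing the source node and line type inside the tag keeps each expectation traceable to the connection that produced it, as the claim requires.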
7. The method according to claim 5, characterized in that mapping the target auxiliary description fragments onto the vehicle functional topology network to determine the network nodes associated with the target auxiliary description fragments, and converting the vehicle state described by each target auxiliary description fragment into an auxiliary state tag attached to the corresponding network node comprises: identifying the names of the vehicle components mentioned in the target auxiliary description fragment and the text content describing the state of those components; finding, in the vehicle functional topology network, the network node that matches the name of the vehicle component and using it as the network node associated with the target auxiliary description fragment; analyzing the text content describing the state of the component to determine the vehicle state type and state degree expressed therein; combining the determined vehicle state type and state degree into a state description entry according to preset formatting rules; and binding the state description entry to information identifying the associated network node to form the auxiliary state tag.
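The formation of an auxiliary state tag in claim 7 can be sketched as below. Keyword lookup stands in for the analysis of state type and degree, and the "type:degree" string stands in for the preset formatting rule; both are assumptions, and the vocabularies STATE_TYPES and DEGREES and the function make_auxiliary_tag are hypothetical.

```python
# Hypothetical vocabularies mapping surface words to state types and degrees.
STATE_TYPES = {"leak": "leakage", "noise": "abnormal_noise", "overheat": "overheating"}
DEGREES = {"slight": "minor", "severe": "severe"}


def make_auxiliary_tag(node_id, status_text):
    """Bind a formatted state description entry to a network node identifier."""
    text = status_text.lower()
    state_type = next((v for k, v in STATE_TYPES.items() if k in text), "unknown")
    degree = next((v for k, v in DEGREES.items() if k in text), "unspecified")
    # Assumed formatting rule: "<type>:<degree>".
    return {"node": node_id, "entry": f"{state_type}:{degree}"}
```

The fallback values "unknown" and "unspecified" keep the tag well formed when the text names a component without a recognisable state word.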
8. A system for extracting unstructured vehicle condition information based on natural language processing, characterized in that it comprises: an acquisition module, used to acquire unstructured mixed text of the target vehicle; a first extraction module, used to extract key description fragments describing the vehicle's state from the unstructured mixed text using a pre-trained natural language processing model; a deduction module, used to map the key description fragments onto a vehicle functional topology network and to deduce the events described by the key description fragments in order to discover logical inconsistencies between fragments, wherein the vehicle functional topology network reflects the connections and dependencies between the subsystems of the vehicle; a second extraction module, used to extract relevant auxiliary description text from the original context of the unstructured mixed text based on the logical inconsistency; and an integration module, used to combine the key description fragments and the auxiliary description text and to perform integrated reasoning under the rule constraints of the vehicle functional topology network to generate vehicle state metadata.
9. A computing device, characterized in that it comprises a processing component and a storage component; the storage component stores one or more computer instructions; and the one or more computer instructions are invoked and executed by the processing component to implement the method for extracting unstructured vehicle condition information based on natural language processing as described in any one of claims 1 to 7.
10. A computer storage medium, characterized in that it stores a computer program that, when executed by a computer, implements the method for extracting unstructured vehicle condition information based on natural language processing as described in any one of claims 1 to 7.