A dual-risk public opinion screening and matching method and system
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- JIANGSU JINNONG
- Filing Date
- 2026-05-15
- Publication Date
- 2026-06-26
Abstract
Description
Technical Field
[0001] This invention belongs to the field of screening and matching technology, and specifically relates to a dual-risk public opinion screening and matching method and system.
Background Technology
[0002] With the rapid development of internet and mobile communication technologies, the speed and scope of public opinion information dissemination are constantly expanding. News media, social platforms, short video applications, and other diverse channels have become important sources of information for the public. The resulting large-scale public opinion data exhibits characteristics such as multi-source heterogeneity, high timeliness, and semantic complexity. Especially in the financial market sector, the identification and screening of public opinion risks has become an urgent technical problem to be solved.
[0003] Existing public opinion screening methods can be mainly divided into two categories: keyword or feature word-based retrieval methods, which usually rely on pre-set sensitive word databases or risk word lists to match and statistically analyze public opinion data; and semantic recognition methods based on large artificial intelligence models, which utilize the semantic understanding and generation capabilities of large models to perform global analysis of public opinion data, thereby identifying potential risks.
[0004] Public opinion risks exhibit significant temporal evolution and multidimensional interweaving. The same event may involve different sensitive words at different stages, and its risk level shows a trend of branching or aggregation over time. If screening methods cannot track these changes in a timely manner, it is difficult to form an effective early warning of potential risks. Existing technologies generally lack targeted screening and matching based on the evolution trajectory of public opinion, dynamic reconstruction of feature words, and semantic-feature fusion.
Summary of the Invention
[0005] To address the shortcomings of existing technologies, this invention proposes a dual-risk public opinion screening and matching method and system. It utilizes the global semantic understanding capabilities of a large model and combines the dynamic weight changes of feature words in the process of public opinion evolution. Through cross-validation via dual risk channels, it achieves high-precision screening and matching of complex public opinion risks.
[0006] To achieve the above objectives, the present invention provides the following technical solution:
[0007] A dual-risk public opinion screening and matching method includes:
[0008] Collect public opinion data and construct a public opinion corpus set, which is obtained by performing time-series annotation and constructing correlation links on the collected public opinion data;
[0009] Based on the aforementioned public opinion corpus, a set of risk feature words is dynamically generated, and a risk weight matrix of risk feature words is constructed according to the intensity of event evolution.
[0010] The aforementioned public opinion corpus is input into a pre-trained language model. Using the feature word risk weight matrix as a mandatory constraint, a cross-level fusion matching of global semantic representation and local feature weights is performed to obtain the first risk screening result.
[0011] The first risk screening result is verified a second time. The dynamic confidence score based on the evolution trajectory of risk feature words is calculated and cross-corrected with the first risk screening result to obtain the final public opinion screening matching output.
[0012] Specifically, the construction of the public opinion corpus includes:
[0013] Timestamp correction is performed on the collected public opinion data;
[0014] Based on the implicit correlations in the corrected public opinion data, event indication fragments are generated, and event indication fragments belonging to the same event clue are aggregated into an initial event set.
[0015] A semantic connection index is introduced into the initial set of events to connect adjacent event fragments in context and generate an event semantic path containing causal chains;
[0016] The event semantic paths are sorted according to their chronological order and semantic coherence to obtain the sorted event semantic paths.
[0017] Specifically, based on the aforementioned public opinion corpus, a set of risk feature words is dynamically generated, and a risk weight matrix for the risk feature words is constructed according to the intensity of event evolution, including:
[0018] Automatically extract a set of semantic fragments from the sorted event semantic path, and perform hierarchical clustering in different time windows to generate multi-level semantic clusters;
[0019] Within the multi-level semantic cluster, candidate feature word sets are gradually selected by iteratively comparing semantic similarity and semantic salience, and then mapped to the corresponding event semantic path;
[0020] Based on the candidate feature word set, a feature word evolution chain across time periods is constructed, and the basic weights of the feature words are dynamically adjusted according to the extension strength and bifurcation frequency of the evolution chain to obtain dynamic basic weights;
[0021] The dynamic basic weights are superimposed with the cross-event coupling parameters to form a risk feature word weight matrix.
[0022] Specifically, the dynamic basic weights are superimposed with the cross-event coupling parameters to form a risk feature word weight matrix, including:
[0023] The dynamic basic weights of each candidate feature word in the evolution chain are expanded according to time windows, and a multi-dimensional time weight vector is constructed based on the transition rules between windows.
[0024] A cross-event coupling parameter is introduced into the multi-dimensional time weight vector. The cross-event coupling parameter is obtained by detecting the semantic overlap and divergence positions between different event paths.
[0025] The time weight vector is combined with the cross-event coupling parameter layer by layer to form an intermediate weight network with a hierarchical structure, and cross-event cross-indexes are generated within the intermediate weight network.
[0026] Based on the cross-indexing of the intermediate weight network, a risk feature word weight matrix is generated, which represents multiple information of the time evolution dimension and the event coupling dimension.
[0027] Specifically, the aforementioned public opinion corpus is input into a pre-trained language model. Using the feature word risk weight matrix as a mandatory constraint, a cross-level fusion matching of global semantic representation and local feature weights is performed to obtain the first risk screening result, including:
[0028] The public opinion corpus is input into a pre-trained language model to generate a global semantic representation covering the entire text, and a context position index is added to the global semantic representation.
[0029] The risk feature word weight matrix is mapped to a constraint vector field, and this constraint vector field is superimposed on the context position index of the global semantic representation to form a semantically constrained corpus representation;
[0030] In the semantically constrained corpus representation, local segments are selected and targeted aggregation is performed by applying risk feature word weights to them, so that local semantic segments and risk feature words form a one-to-one correspondence.
[0031] The global semantic representation is aligned across levels with the aggregated results of local feature weights, and a fused semantic matrix is generated through multi-level iterative comparison.
[0032] Based on the fused semantic matrix, the first risk screening result is extracted.
[0033] Specifically, based on the fused semantic matrix, the first risk screening result is extracted, including:
[0034] In the fused semantic matrix, preliminary risk index lists are generated along the semantic dimension and the feature word dimension respectively to form a set of candidate risk segments;
[0035] The candidate risk segment set is subjected to multi-level cross-comparison, and a mapping link between segments is established based on the continuity of semantic context and the concentration of feature word weights.
[0036] The mapping link is transformed into a hierarchical risk structure graph, and the set of risk nodes with the highest semantic cross-layer connectivity is automatically identified in the structure graph;
[0037] Based on the set of risk nodes, a first risk screening result is generated.
[0038] Specifically, the first risk screening result is validated a second time by calculating a dynamic confidence score based on the evolution trajectory of risk feature words, and then performing a dual-track cross-correction with the first risk screening result to obtain the final public opinion screening matching output, including:
[0039] Extract the corresponding set of feature words from the first risk screening results, and retrieve multiple historical segments of the set in the evolutionary trajectory along the time dimension;
[0040] Based on the historical fragments, a temporal change curve of the feature words is constructed, and semantic bifurcation points and semantic convergence points are marked in the temporal change curve to generate a dynamic confidence vector.
[0041] The dynamic confidence vector is mapped layer by layer with the first risk screening result to form a dual-track comparison framework of semantic matching channel and feature evolution channel;
[0042] Within the dual-track alignment framework, a set of nodes that maintain cross-track consistency is obtained by iteratively filtering and removing segments that do not meet the consistency requirement.
[0043] The final public opinion screening and matching results are output based on the set of consistent nodes.
[0044] Specifically, the dynamic confidence vector is mapped layer by layer with the first risk screening result to form a dual-track comparison framework with semantic matching channel and feature evolution channel, including:
[0045] The dynamic confidence vector is expanded according to the time hierarchy to generate a time index mapping table;
[0046] The semantic representation of the feature words in the first risk screening result is decomposed into layers, and a semantic index mapping table is generated with semantic depth as the dimension.
[0047] The time index mapping table and the semantic index mapping table are cross-coupled. During the coupling process, the corresponding semantic level is matched for each time segment to form an initial mapping matrix.
[0048] Based on the initial mapping matrix, the semantic matching channel and the feature evolution channel are distinguished, and a bidirectional interactive link is established between the two to construct a dual-track comparison framework.
[0049] Specifically, within the dual-track alignment framework, obtaining a set of cross-track consistent nodes by iteratively filtering and removing segments that do not meet the consistency requirement includes:
[0050] In the dual-track alignment framework, candidate nodes in the semantic matching channel and the feature evolution channel are labeled respectively, and a cross-track index is assigned to each candidate node;
[0051] Based on the cross-track index, the coupling relationship between nodes in the two channels is detected, and cross-track consistency candidate pairs are constructed according to the coupling strength;
[0052] The cross-track consistency candidate pairs are subjected to multiple rounds of iterative screening. In each round of screening, node pairs that fail to maintain contextual coherence or evolutionary continuity are eliminated, and the set of nodes that form a stable link is retained.
[0053] The set of nodes in the stable link is further aggregated into a set of cross-track consistent nodes.
[0054] A dual-risk public opinion screening and matching system is provided to implement the aforementioned dual-risk public opinion screening and matching method, comprising: a data acquisition module, a weight matrix construction module, a first result output module, and a final result output module;
[0055] The data acquisition module is used to collect public opinion data and construct a public opinion corpus set, which is obtained by performing time-series annotation and association link construction on the collected public opinion data;
[0056] The weight matrix construction module is used to dynamically generate a set of risk feature words based on the set of public opinion corpus, and construct a risk weight matrix of risk feature words according to the intensity of event evolution;
[0057] The first result output module is used to input the public opinion corpus into a pre-trained language model, and use the feature word risk weight matrix as a mandatory constraint to perform cross-level fusion matching of global semantic representation and local feature weights to obtain the first risk screening result.
[0058] The final result output module is used to perform secondary verification on the first risk screening result, calculate the dynamic confidence score based on the evolution trajectory of risk feature words, and perform dual-track cross-correction with the first risk screening result to obtain the final public opinion screening matching output.
[0059] Compared with the prior art, the beneficial effects of the present invention are:
[0060] This invention proposes a dual-risk public opinion screening and matching method and system. First, it constructs an evolutionary trajectory for multi-source public opinion information and dynamically generates a feature word weight matrix. Then, it introduces this weight matrix as a constraint into a pre-trained language model, forming a cross-level fusion matching of global semantic representation and local feature weights. Subsequently, the initial screening results are cross-corrected under a dual-track comparison framework, and the final screening and matching result is output. The overall method combines the semantic understanding advantages of a large language model with the accurate characterization of feature word evolution, avoiding the biases and omissions caused by single methods. It can dynamically track risk factors as public opinion develops and reduce misjudgments and omissions through dual-channel consistency verification, achieving accurate, stable, and systematic identification of complex public opinion risks.
Attached Figure Description
[0061] Figure 1 is a flowchart of the dual-risk public opinion screening and matching method provided by the present invention;
[0062] Figure 2 is a flowchart for generating the first risk screening result provided by the present invention;
[0063] Figure 3 is a flowchart for generating the final public opinion screening and matching result provided by the present invention;
[0064] Figure 4 is an architecture diagram of the dual-risk public opinion screening and matching system provided by the present invention.
Detailed Implementation
[0065] The present application will now be described in detail with reference to specific embodiments. These embodiments will help those skilled in the art to further understand the present application, but do not limit the present application in any way. It should be noted that those skilled in the art can make several modifications and improvements without departing from the concept of the present application. These all fall within the protection scope of the present application.
[0066] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0067] It should be noted that, unless there is conflict, the various features in the embodiments of this application can be combined with each other, all of which are within the protection scope of this application. Furthermore, although functional modules are divided in the device schematic diagram and a logical order is shown in the flowchart, in some cases, the steps shown or described can be performed in a different order than the module division in the device or the order in the flowchart. In addition, the terms "first," "second," and "third" do not limit the data or execution order; they are merely used to distinguish identical or similar items with essentially the same function and purpose.
[0068] Unless otherwise defined, all technical and scientific terms used in this specification have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the scope of this application. The term "and / or" as used in this specification includes any and all combinations of one or more of the associated listed items.
[0069] Example 1
[0070] Referring to Figures 1-3, the present invention provides an embodiment of a dual-risk public opinion screening and matching method, comprising the following specific steps:
[0071] Step S1: Collect public opinion data and construct a public opinion corpus set, which is obtained by performing time-series annotation and association link construction on the collected public opinion data.
[0072] The specific steps of step S1 are as follows:
[0073] Step S101: Perform timestamp correction on the collected public opinion data.
[0074] In this embodiment, the collected public opinion data comes from various heterogeneous channels, including news portals, social platforms, and short video text transcription; a unified format timestamp label is attached to all collected data; the specific correction process is as follows: the collection time is compared with the original time information provided by the source platform. If there is a deviation, the timestamp is adjusted according to the preset network latency reference table and cross-source alignment rules so that it is mapped to the same global time base.
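The timestamp correction of step S101 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the embodiment's implementation: the table `LATENCY_TABLE`, its values, and the function `correct_timestamp` are hypothetical, and the cross-source alignment rule is assumed to be "use the source platform's time when it is plausible, otherwise estimate from the collection time minus the channel's reference latency".

```python
from datetime import datetime, timedelta, timezone

# Hypothetical latency reference table: typical collection delay per channel, in seconds.
LATENCY_TABLE = {"news": 30, "social": 5, "video": 120}

def correct_timestamp(collected_at, source_time, channel):
    """Map one record onto a common UTC time base.

    Both datetimes must be timezone-aware; source_time may be None.
    """
    collected_utc = collected_at.astimezone(timezone.utc)
    # Fallback estimate: collection time minus the channel's reference latency.
    estimate = collected_utc - timedelta(seconds=LATENCY_TABLE[channel])
    if source_time is None:
        return estimate
    source_utc = source_time.astimezone(timezone.utc)
    # Assumed rule: a source time later than the collection time is implausible.
    if source_utc > collected_utc:
        return estimate
    return source_utc
```

Normalizing every record to UTC first means that records from platforms reporting local time and platforms reporting UTC become directly comparable before any event linking is attempted.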
[0075] Step S102: Based on the implicit correlation of the corrected public opinion data, generate event indication fragments and aggregate event indication fragments belonging to the same event clue into an initial event set.
[0076] In this embodiment, the text corpus is segmented into the smallest semantic units, and potential causal relationships, topic co-occurrence, and entity referential links are extracted by combining contextual information. By comparing the semantic similarity and semantic burstiness indicators between different corpora within the same time window, semantic segments that may point to the same event are identified. The segments with temporal continuity and semantic relevance are defined as event indicator segments. Segments with high semantic overlap or common entity tags within the same time period are labeled as candidate sets. Through hierarchical clustering, these candidate segments are gradually merged until an initial set that can reflect complete event clues is formed.
[0077] Step S103: Introduce a semantic connection index into the initial set of events, connect adjacent event fragments in context, and generate an event semantic path containing causal chains.
[0078] In this embodiment, within the initial event set, each segment is semantically labeled and its event elements are parsed to identify trigger words, subjects, objects, and time markers. Subsequently, by comparing the matching degree of adjacent segments on these elements, a semantic connection index is generated. This index is not a simple similarity measure, but rather comprehensively considers causal triggering relationships, contextual connection probabilities, and entity continuity to establish one-to-many or many-to-one logical links between segments. Each segment is embedded in an association network and has contextual interaction relationships with other segments. Based on the chronological order, segments with causal triggering markers are prioritized for concatenation, and then segments with semantic supplementary relationships or entity continuity relationships are secondary concatenated, ultimately generating a semantic path covering the entire event set.
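One possible way to combine the three factors named above into a single semantic connection index is a weighted sum. The sketch below is an assumption for illustration only: the `Fragment` fields, the weights `w_causal`, `w_context`, `w_entity`, and the use of Jaccard overlap for entity continuity are hypothetical and are not specified by the embodiment.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    text: str
    trigger: str = ""                 # trigger word of this fragment
    entities: frozenset = frozenset() # entities mentioned in the fragment
    causes: frozenset = frozenset()   # trigger words this fragment may lead to

def connection_index(a, b, context_prob,
                     w_causal=0.5, w_context=0.3, w_entity=0.2):
    """Score how strongly fragment b follows from fragment a (0.0 to 1.0)."""
    # Causal triggering: does a list b's trigger as a possible consequence?
    causal = 1.0 if b.trigger and b.trigger in a.causes else 0.0
    # Entity continuity: Jaccard overlap of the two entity sets.
    union = a.entities | b.entities
    entity = len(a.entities & b.entities) / len(union) if union else 0.0
    # context_prob is supplied by the caller (e.g. from a language model).
    return w_causal * causal + w_context * context_prob + w_entity * entity
```

Because the index is directional (a to b), one fragment can score highly against several successors or predecessors, which is what allows the one-to-many and many-to-one links described above.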
[0079] Step S104: Sort the event semantic paths according to chronological order and semantic coherence to obtain the sorted event semantic paths.
[0080] In this embodiment, the timestamp information carried by each segment in the semantic path is extracted and initially sorted according to the chronological order. On this basis, for segments that overlap or are close in time, their contextual coherence is further calculated. The calculation of coherence depends on the semantic consistency, causal triggering association and entity continuation relationship between segments. Through the dual comparison of time and semantics, the segments that originally overlapped or were parallel are reasonably rearranged to form an ordered path that conforms to the natural evolution of events.
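The two-stage ordering of step S104 can be illustrated with a short sketch: segments are first sorted by timestamp, and segments that fall close together in time are then reordered greedily by coherence with the segment placed just before them. The function name `sort_path`, the `window_s` threshold, and the caller-supplied `coherence` function are assumptions; the embodiment does not fix a particular procedure.

```python
def sort_path(segments, coherence, window_s=60):
    """Order segments by time, breaking near-ties by contextual coherence.

    segments: list of (timestamp_seconds, segment_id)
    coherence: function (prev_id, next_id) -> float, higher is more coherent
    """
    ordered = sorted(segments, key=lambda s: s[0])
    result = []
    i = 0
    while i < len(ordered):
        # Collect the group of segments within window_s of the group's first one.
        j = i
        group = []
        while j < len(ordered) and ordered[j][0] - ordered[i][0] <= window_s:
            group.append(ordered[j])
            j += 1
        # Inside the group, repeatedly pick the segment most coherent with
        # the last placed one; ties keep chronological order.
        while group:
            if result:
                prev_id = result[-1][1]
                nxt = max(group, key=lambda s: coherence(prev_id, s[1]))
            else:
                nxt = group[0]
            group.remove(nxt)
            result.append(nxt)
        i = j
    return result
```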
[0081] Step S2: Based on the aforementioned public opinion corpus, dynamically generate a set of risk feature words, and construct a risk weight matrix for the risk feature words according to the intensity of event evolution.
[0082] The specific steps of step S2 are as follows:
[0083] Step S201: Automatically extract a set of semantic fragments from the sorted event semantic path, and perform hierarchical clustering within different time windows to generate multi-level semantic clusters.
[0084] In this embodiment, semantic unit vector representations are formed by extracting the topic words, core predicates, and associated entities from the fragments. Then, multiple windows are set in the time dimension, and the semantic fragments are divided into the corresponding windows according to their time intervals. Within each time window, the semantic fragments are classified in a multi-level manner using hierarchical clustering. The first-level clustering focuses on the topic similarity between fragments, aggregating fragments with highly overlapping semantics into primary semantic clusters. On this basis, the second-level clustering further introduces contextual logical relationships and entity relevance, merging and subdividing the primary semantic clusters to form multi-level semantic clusters.
[0085] Step S202: Within the multi-level semantic cluster, by iteratively comparing semantic similarity and semantic salience, a set of candidate feature words is gradually selected and mapped to the corresponding event semantic path.
[0086] In this embodiment, high-frequency words and key entity words are counted within each semantic cluster, and an initial vocabulary is generated by combining the context vector of the segment in which they are located. Subsequently, semantic similarity comparison is performed on the words in the initial vocabulary, and words with similar or redundant meanings are merged to reduce the repetition in the candidate set. This process adopts an iterative approach, and the cluster center of the words is updated in each round to ensure that the words retained at the end can accurately reflect the theme features of the semantic cluster. By comparing the differences in word distribution of the same semantic cluster in adjacent time windows, words that appear rapidly or whose frequency increases significantly in a short period of time are identified. These emerging words are combined with stable words that have been screened for similarity to form a candidate feature word set, and further mapped to the corresponding event semantic path, so that each path carries its key semantic features.
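The detection of emerging words by comparing adjacent time windows can be sketched as a frequency-growth test. The thresholds `min_count` and `growth` and the add-one smoothing are hypothetical choices made for illustration; the embodiment only states that rapidly appearing or sharply increasing words are identified.

```python
from collections import Counter

def emerging_words(prev_tokens, curr_tokens, min_count=3, growth=2.0):
    """Return words whose frequency rose sharply from one window to the next."""
    prev, curr = Counter(prev_tokens), Counter(curr_tokens)
    result = set()
    for word, n in curr.items():
        if n < min_count:
            continue  # ignore words too rare to be meaningful
        # Add-one smoothing keeps the ratio finite for brand-new words.
        if n / (prev[word] + 1) >= growth:
            result.add(word)
    return result
```

The candidate feature word set would then be the union of these emerging words and the stable words retained by the similarity-based merging.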
[0087] Step S203: Based on the candidate feature word set, construct a feature word evolution chain across time periods, and dynamically adjust the basic weights of the feature words according to the extension strength and bifurcation frequency of the evolution chain to obtain dynamic basic weights.
[0088] In this embodiment, feature words within different time windows are aligned and compared to identify terms that continue to appear across windows or have high semantic consistency. Then, using chronological order as the main axis, these continuing terms are connected to form a feature word evolution chain across time periods. In this chain, each node corresponds to the performance of a feature word in a specific time period, and the connections between nodes reflect the persistence or variability of the feature word over time. In this way, the originally scattered feature words are organized into a sequence structure with a temporal context, revealing the potential evolutionary trajectory of risk vocabulary in the development of events. At the same time, while the evolution chain is being constructed, the basic weights of the feature words are dynamically adjusted based on the extension strength and bifurcation frequency of the chain. Specifically, if a feature word appears in multiple consecutive time windows, its chain extension strength is high, and the basic weight of the feature word is increased accordingly during the iteration process. If a feature word bifurcates frequently in the chain, that is, it derives multiple semantic directions in different paths, this indicates strong uncertainty in the evolution of public opinion, and its basic weight is adjusted accordingly to reflect its risk potential and complexity. Through this dynamic adjustment process, basic weights that reflect the temporal evolution characteristics of the feature words are finally obtained.
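A minimal sketch of such a weight adjustment is given below. It assumes that extension strength is the longest run of consecutive windows containing the word, that bifurcation frequency is the share of windows in which the word has more than one semantic direction, and that both raise the weight multiplicatively. These definitions, the direction of the bifurcation adjustment, and the coefficients `alpha` and `beta` are illustrative assumptions, not values taken from the embodiment.

```python
def longest_run(presence):
    """Length of the longest run of consecutive truthy entries."""
    best = run = 0
    for present in presence:
        run = run + 1 if present else 0
        best = max(best, run)
    return best

def dynamic_weight(base, presence, fork_counts, alpha=0.5, beta=0.3):
    """Adjust a feature word's base weight from its evolution chain.

    presence: per-window flags, truthy if the word appears in that window
    fork_counts: per-window number of semantic directions of the word
    """
    n = len(presence)
    if n == 0:
        return base
    extension = longest_run(presence) / n           # persistence over time
    forks = sum(1 for c in fork_counts if c > 1) / n  # share of forking windows
    return base * (1 + alpha * extension) * (1 + beta * forks)
```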
[0089] Step S204: The dynamic basic weights and the cross-event coupling parameters are superimposed and calculated to form a risk feature word weight matrix.
[0090] The specific steps of step S204 are as follows:
[0091] Step S2041: Expand the dynamic basic weights of each candidate feature word in the evolution chain according to the time window, and construct a multi-dimensional time weight vector based on the transition rules between windows.
[0092] In this embodiment, the weight values of the same feature word within consecutive time periods are arranged sequentially according to the order of time windows to form a weight sequence. This sequence can intuitively reflect the intensity and trend of the feature word's appearance in the process of public opinion evolution, expanding the single weight data into a multi-point record that changes over time. Furthermore, the transfer relationship between time windows is characterized by introducing the transfer rules between windows based on the weight sequence. Specifically, the frequency difference and semantic continuity of the feature word in adjacent windows are statistically analyzed, and transfer links are established between the nodes of the weight sequence based on this. In this way, the weight values in different time windows are no longer isolated points, but are connected into a multi-dimensional structure through transfer relationships, ultimately constructing a multi-dimensional time weight vector that includes time continuity and trend of change.
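The expansion of a single weight into a multi-dimensional time weight vector can be pictured as attaching transition information to each window. In the sketch below each window yields a triple of weight, frequency change from the previous window, and semantic continuity with the previous window. The triple layout and the function name `time_weight_vector` are assumptions for illustration.

```python
def time_weight_vector(weights, freqs, continuity):
    """Build per-window (weight, frequency delta, continuity) triples.

    weights, freqs: one value per time window, in window order
    continuity: semantic continuity between window t-1 and t,
                so it has one entry fewer than weights
    """
    if len(freqs) != len(weights) or len(continuity) != max(len(weights) - 1, 0):
        raise ValueError("input lengths are inconsistent")
    vector = []
    for t, w in enumerate(weights):
        if t == 0:
            vector.append((w, 0, 0.0))  # first window has no predecessor
        else:
            vector.append((w, freqs[t] - freqs[t - 1], continuity[t - 1]))
    return vector
```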
[0093] Step S2042: Introduce a cross-event coupling parameter into the multi-dimensional time weight vector. The cross-event coupling parameter is obtained by detecting the semantic overlap and divergence positions between different event paths.
[0094] In this embodiment, feature word sets from different event semantic paths are compared to identify overlapping parts in semantic expression, such as shared entities, repeated topics, or similar contextual semantics. Simultaneously, divergence points between these paths are marked, including semantic differences, opposing viewpoints, and broken entity associations. Through parallel detection of overlap and divergence, cross-features between events at the semantic level are obtained. After obtaining the overlap and divergence information, it is transformed into quantifiable coupling parameters and embedded into a multi-dimensional time weight vector. Specifically, at each time node, the weight corresponding to that node is jointly modeled with the overlap and divergence of other event paths within the same time segment, thereby forming a correction value with cross-event dimensions. This correction value is combined with the original time weight to form an expanded vector representation.
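A simple way to quantify overlap and divergence between two event paths within one time window is sketched below. It assumes each path is summarized as a mapping from feature word to a stance label (+1 or -1), measures overlap as the Jaccard ratio of the word sets, and measures divergence as the share of shared words with opposing stance. The stance encoding and both formulas are hypothetical illustrations.

```python
def coupling(path_a, path_b):
    """Return (overlap, divergence) for two event paths in one time window.

    path_a, path_b: dict mapping feature word -> stance (+1 or -1)
    """
    shared = path_a.keys() & path_b.keys()
    union = path_a.keys() | path_b.keys()
    overlap = len(shared) / len(union) if union else 0.0
    # Divergence: shared words on which the two paths take opposing stances.
    diverging = [w for w in shared if path_a[w] != path_b[w]]
    divergence = len(diverging) / len(shared) if shared else 0.0
    return overlap, divergence

def expand(weight, path_a, path_b):
    """Append the coupling parameters to a time weight, giving an expanded entry."""
    return (weight, *coupling(path_a, path_b))
```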
[0095] Step S2043: Combine the time weight vector with the cross-event coupling parameter layer by layer to form an intermediate weight network with a hierarchical structure, and generate cross-event cross-indexes within the intermediate weight network.
[0096] In this embodiment, a time window is used as the base layer, and the weight vectors of each feature word within the corresponding time period are processed into nodes. A coupling parameter is superimposed on it as the edge weight information across events, which is used to describe the correlation strength between different event paths. Through this hierarchical embedding method, the originally independent time weight vectors are gradually combined into a network framework with a hierarchical structure. This framework includes both vertical time series relationships and horizontal cross-event connections, forming an intermediate weight network. After the intermediate weight network is constructed, a cross-event index is generated within it. Specifically, feature word nodes that appear in multiple event paths in the network are searched, and their relative positions and weight differences in different paths are recorded to form a cross-index table. At the same time, for nodes that appear at divergent positions, their bifurcation direction and distribution density are marked in the index table, and finally, the cross-index is obtained.
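The cross-index table described above can be sketched as a dictionary keyed by feature word. The sketch assumes each event path is an ordered list of (word, weight) nodes and records only the first occurrence of a word in each path; the function name `cross_index` and this data layout are hypothetical.

```python
def cross_index(paths):
    """Index feature words that occur in more than one event path.

    paths: dict mapping event id -> list of (word, weight) in path order
    returns: dict word -> {event id: (position, weight)}
    """
    index = {}
    for event, nodes in paths.items():
        for pos, (word, weight) in enumerate(nodes):
            # Keep the first occurrence of a word within each path.
            index.setdefault(word, {}).setdefault(event, (pos, weight))
    # Only words shared by at least two event paths are cross-event nodes.
    return {w: locs for w, locs in index.items() if len(locs) > 1}
```

Relative positions and weight differences between paths can be read directly from the entries, for example by comparing the `(position, weight)` pairs of one word across two event ids.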
[0097] Step S2044: Based on the cross-index of the intermediate weight network, generate a risk feature word weight matrix, which represents multiple information of the time evolution dimension and the event coupling dimension.
[0098] In this embodiment, the nodes in the intermediate weight network are expanded in chronological order, and the distribution of the same feature word in different event paths is normalized by combining the cross-event mapping relationship recorded in the cross-index. Subsequently, the normalization results are sequentially filled into the time dimension axis of the matrix to ensure that each row can fully represent the dynamic changes of the feature word under multiple time windows. At the same time, the coupling degree information of the cross-index is introduced in the column direction to establish the correspondence between cross events. Through this bidirectional mapping method, a preliminary matrix prototype containing both time and event dimensions is formed. After obtaining the preliminary matrix, the interactive characteristics of time evolution and event coupling are enhanced by multi-dimensional combination operations within the matrix. Specifically, the continuous weight values in the time dimension and the coupling parameters in the event dimension are superimposed and weighted layer by layer to obtain a comprehensive matrix structure that reflects the risk evolution trend and cross-event semantic linkage, and finally, a risk feature word weight matrix is generated.
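As a rough illustration of how the time dimension and the event coupling dimension might be combined, the sketch below builds one row per feature word and one column per time window. It assumes that the per-event weights of a word are averaged within each window and then scaled by (1 + coupling degree). Both the averaging and the scaling rule are assumptions made for this example; the embodiment describes the combination only in general terms.

```python
def weight_matrix(time_weights, coupling):
    """Build a feature-word by time-window weight matrix.

    time_weights: dict word -> list of per-event weight series,
                  each series having one weight per time window
    coupling: dict word -> cross-event coupling degree (missing means 0.0)
    returns: (row labels, matrix as a list of rows)
    """
    words = sorted(time_weights)
    matrix = []
    for word in words:
        scale = 1.0 + coupling.get(word, 0.0)
        # zip(*series) walks the windows; each column holds one weight per event.
        row = [sum(col) / len(col) * scale for col in zip(*time_weights[word])]
        matrix.append(row)
    return words, matrix
```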
[0099] Step S3: Input the public opinion corpus set into the pre-trained language model, and use the risk feature word weight matrix as a mandatory constraint to perform cross-level fusion matching of the global semantic representation and the local feature weights to obtain the first risk screening result.
[0100] As shown in Figure 2, the specific steps of step S3 are as follows:
[0101] Step S301: Input the public opinion corpus into the pre-trained language model to generate a global semantic representation covering the entire text, and add a context position index to the global semantic representation.
[0102] In this embodiment, the corpus is vectorized so that it can be received by the large language model and mapped to a high-dimensional semantic space. In this space, the model captures the contextual dependencies between different corpus fragments through its multi-layer semantic encoding structure and generates a unified representation vector that covers the entire text. To make the global semantic representation easier to use for context retrieval and precise positioning, a context position index is added to it. Specifically, for each fragment, its starting position, its context span, and its correlation with adjacent fragments are embedded as an index in the corresponding semantic unit of the global semantic representation, following the order of the original corpus.
[0103] Step S302: Map the risk feature word weight matrix to a constraint vector field, and superimpose the constraint vector field onto the context position index of the global semantic representation to form a semantically constrained corpus representation.
[0104] In this embodiment, the weight of each feature word in the matrix is vectorized and encoded, mapping numerical information from different time periods and event dimensions into vector tensors. Then, a field function is generated in the semantic space according to the distribution density and coupling relationship of the feature words, enabling the vector corresponding to each feature word weight to form a local constraint region in the semantic space. These regions collectively constitute a constraint vector field, which guides the convergence direction of the global semantic representation. After obtaining the constraint vector field, it is superimposed onto the context position index of the global semantic representation. Specifically, using the context index as an anchor point, the semantic units at the corresponding positions are matched one by one with the local weight vectors in the constraint vector field, and weight adjustments are performed on the overlapping parts. This allows the global semantic representation to explicitly reflect the influence of risk feature words in local segments. Through this superposition process, the original global semantic representation, which only had overall coherence, is transformed into a semantically constrained corpus representation.
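The superposition step can be illustrated with a deliberately simplified sketch. Here each semantic unit is reduced to a scalar score rather than a vector, and the adjustment rule (scaling by `1 + strength * matched weight`) is an assumption; `apply_constraints`, `strength`, and the unit layout are hypothetical names used only for this example.

```python
def apply_constraints(units, word_weights, strength=0.5):
    """Re-weight position-indexed semantic units by the risk words they contain.

    units: [{"position": int, "text": str, "score": float}, ...]
    word_weights: {feature_word: weight taken from the weight matrix}
    """
    constrained = []
    for unit in units:
        tokens = unit["text"].lower().split()
        # Total weight of risk feature words overlapping this unit.
        boost = sum(word_weights.get(token, 0.0) for token in tokens)
        constrained.append(
            {**unit, "score": unit["score"] * (1.0 + strength * boost)}
        )
    return constrained
```

Units without any risk feature word keep their original score, so the constraint only acts on the overlapping parts, as the paragraph describes.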
[0105] Step S303: In the semantically constrained corpus representation, select local segments and apply targeted aggregation of risk feature word weights to them, so that the local semantic segments and the risk feature words form a one-to-one correspondence.
[0106] In this embodiment, the global semantic representation is segmented based on the context position index, dividing the text into multiple local regions with independent semantic units. Then, within each local region, the correspondence between it and the risk feature word weight matrix is retrieved. By calculating the semantic matching degree and weight constraint strength, the most representative segments are selected as candidate local units. After the local segments are selected, a targeted aggregation operation is performed. Specifically, the semantic vectors in the candidate segments are superimposed one by one with their corresponding risk feature word weights, ensuring that the semantic representation of each local segment carries a clear risk feature attribute. During the aggregation process, a targeted mapping mechanism is introduced to ensure that the weights of risk feature words preferentially influence segments related to their semantics, rather than being evenly distributed across all local units. The final result is that a one-to-one correspondence is formed between local semantic segments and risk feature words, constituting a local representation sequence with risk feature labels.
[0107] Step S304: Align the global semantic representation with the local feature weight aggregation result across levels, and generate a fused semantic matrix through multi-level iterative comparison.
[0108] In this embodiment, the global semantic representation is decomposed into semantic units of different dimensions, such as the topic layer, context layer, and entity layer. Simultaneously, the aggregated results of local feature weights are hierarchically encoded according to the weight of feature words and their semantic relevance. Then, during the alignment process, using the global semantic representation as a framework, each semantic unit is matched with its corresponding local feature representation, and the mapping relationship is determined based on weight priority, achieving a correspondence between the macroscopic semantic structure and the fine-grained elements of risk features. After alignment, a fused semantic matrix is generated through multi-level iterative comparison. Specifically, based on the initial alignment, the semantic consistency between global units and local features is repeatedly compared iteratively. In each round of comparison, the distribution of local weights is dynamically adjusted, gradually harmonizing the global framework and local constraints. After multi-level iterations, the fused semantic matrix is obtained.
[0109] Step S305: Extract the first risk screening result based on the fused semantic matrix.
[0110] The specific steps of step S305 are as follows:
[0111] Step S3051: In the fused semantic matrix, generate preliminary risk index lists along the semantic dimension and the feature word dimension respectively to form a set of candidate risk segments.
[0112] In this embodiment, at the semantic dimension, semantic units in the matrix are retrieved one by one to identify segments with prominent contextual coherence and causal orientation in the global representation, and these segments are included in the semantic index list. Simultaneously, at the feature word dimension, feature word nodes with high weight values in the matrix are extracted, and a feature word index list is generated according to their distribution in different semantic levels. After the semantic index list and the feature word index list are formed, they are cross-compared, and segments appearing in both lists are marked as candidate risk segments. Specifically, the context segments corresponding to the semantic index list are matched one by one with the high-weight nodes corresponding to the feature word index list, and their positions and association paths in the matrix are recorded. In this way, scattered semantic units and risk feature points are aggregated into a set of candidate risk segments.
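The cross-comparison of the two index lists amounts to a set intersection, sketched below under stated assumptions: segment-level semantic scores and per-segment feature word hits are taken as given inputs, and both thresholds are hypothetical parameters not specified in the embodiment.

```python
def candidate_segments(semantic_scores, word_hits, word_weights,
                       sem_threshold=0.5, weight_threshold=0.5):
    """Return segment ids present in both the semantic and feature word lists.

    semantic_scores: {segment_id: coherence/causality score}
    word_hits: {segment_id: [feature words found in the segment]}
    word_weights: {feature_word: weight}
    """
    semantic_index = {
        seg for seg, score in semantic_scores.items() if score >= sem_threshold
    }
    feature_index = {
        seg for seg, words in word_hits.items()
        if any(word_weights.get(w, 0.0) >= weight_threshold for w in words)
    }
    # Segments appearing in both lists become candidate risk segments.
    return sorted(semantic_index & feature_index)
```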
[0113] Step S3052: Perform multi-level cross-comparison on the candidate risk segment set, and establish a mapping link between segments based on the continuity of semantic context and the concentration of feature word weights.
[0114] In this embodiment, the contextual coherence between segments in the set is detected one by one, including factors such as chronological order, continuity of semantic roles, and causal triggering relationships. When two segments exhibit continuous or complementary features in semantic structure, they are determined to be a segment pair with contextual continuity. At the same time, the concentration of feature word weights is introduced as a screening factor during the comparison process, and segments with highly concentrated feature word weights are given higher connection priority to ensure that the constructed mapping link can highlight the semantic association of high-risk elements. The cross-comparison process is not completed in one go, but gradually converges through multi-level iterations. In the initial stage, contextual continuity is the main factor to form basic segment connection relationships. In the subsequent stage, the existing connection relationships are weighted and adjusted hierarchically by combining the concentrated distribution of feature word weights. After multiple rounds of comparison and correction, the mapping link is finally formed.
[0115] Step S3053: Convert the mapping link into a hierarchical risk structure graph, and automatically identify the set of risk nodes with the highest semantic cross-layer connectivity in the structure graph.
[0116] In this embodiment, the segment nodes in the mapping link are used as the basic unit. Based on the semantic category and time window to which the segment belongs, the nodes are divided into different levels, and the causal and connection relationships between the upstream and downstream are maintained between the levels. Then, the cross-layer mapping link is used as the edge to connect the nodes in different levels that have semantic dependencies or feature word weight associations, gradually forming a multi-layer graph structure from low-level semantic segments to high-level risk semantics. After the hierarchical risk structure graph is constructed, it is also necessary to automatically identify the set of risk nodes with the highest semantic cross-layer connectivity in the graph. Specifically, the cross-layer connection of all nodes in the graph is traversed and calculated, and the number of connections, connection strength and associated feature word weights of each node between different levels are counted. Nodes with high connectivity often correspond to semantic segments that have risk orientation in multiple levels, and they are marked as core risk nodes. Finally, the set of risk nodes is output.
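A minimal sketch of the cross-layer connectivity count follows. It assumes the structure graph is given as a node-to-layer map plus weighted edges, and it scores a node by the summed strength of its edges that cross layers; the function name, the `top_k` cut-off, and this scoring rule are assumptions for illustration.

```python
def core_risk_nodes(layers, edges, top_k=1):
    """Rank nodes by the total strength of their cross-layer connections.

    layers: {node: layer number}
    edges: [(node_u, node_v, strength), ...]
    """
    score = {node: 0.0 for node in layers}
    for u, v, strength in edges:
        if layers[u] != layers[v]:  # only edges between different layers count
            score[u] += strength
            score[v] += strength
    # Highest connectivity first; ties broken by node name for determinism.
    ranked = sorted(score, key=lambda node: (-score[node], node))
    return ranked[:top_k]
```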
[0117] Step S3054: Generate a first risk screening result based on the set of risk nodes.
[0118] In this embodiment,
[0119] Step S4: Perform secondary verification on the first risk screening result, calculate the dynamic confidence score based on the evolution trajectory of risk feature words, and perform dual-track cross-correction with the first risk screening result to obtain the final public opinion screening matching output.
[0120] As shown in Figure 3, the specific steps of step S4 are as follows:
[0121] Step S401: Extract the corresponding set of feature words from the first risk screening result, and retrieve multiple historical segments of the set in the evolutionary trajectory along the time dimension.
[0122] In this embodiment, core feature words closely related to risk events are identified by retrieving high-weight semantic units from the first risk screening result, and are combined into a set. Using this set as the search criterion, the previously constructed public opinion evolution trajectory is then searched to match and filter all relevant segments on the timeline. The search considers not only the direct occurrence of feature words in different time windows; a semantic expansion mechanism is also introduced to merge terms whose surface form differs because of differences in expression. For example, if the same feature word appears as synonyms or near-synonyms at different stages, these are included in the same search scope through semantic comparison. Through this expanded comparison and integration, a set of historical segments with higher coverage and stronger semantic consistency is obtained along the time dimension.
[0123] Step S402: Based on the historical fragments, construct the temporal change curve of the feature words, and mark the semantic bifurcation point and semantic convergence point in the temporal change curve to generate a dynamic confidence vector.
[0124] In this embodiment, the frequency of occurrence, semantic strength of context, and weight information of associated event paths of each feature word in different time windows are extracted and arranged in chronological order to form a temporal change curve of the feature words. On this curve, each node corresponds to the performance of the feature word in a specific time period, and the connection between nodes reflects the trend of feature word evolution over time. In order to more comprehensively reflect the dynamic changes of risk features, semantic bifurcation points and semantic convergence points are marked on the temporal change curve. Semantic bifurcation points usually correspond to the new meanings or associated events derived by the feature word in different time periods, while semantic convergence points reflect the situation where the semantics of multiple events gradually converge to the same core expression. By identifying and labeling these key nodes, key features reflecting the evolutionary complexity of feature words are extracted from the temporal curve. Subsequently, these features are converted into numerical representations and combined to generate a dynamic confidence vector.
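One possible reading of this step is sketched below. As a simplifying assumption, a bifurcation is marked where the number of event paths associated with the feature word increases from one window to the next, and a convergence where it decreases; the embodiment itself does not define these tests numerically. The name `dynamic_confidence` and the input layout are hypothetical.

```python
def dynamic_confidence(curve):
    """Build a confidence vector and mark bifurcation/convergence windows.

    curve: [(frequency, number_of_associated_event_paths), ...] per time window,
    in chronological order; assumed non-empty.
    """
    bifurcations = [i for i in range(1, len(curve))
                    if curve[i][1] > curve[i - 1][1]]
    convergences = [i for i in range(1, len(curve))
                    if curve[i][1] < curve[i - 1][1]]
    peak = max(freq for freq, _ in curve) or 1  # avoid division by zero
    vector = [freq / peak for freq, _ in curve]
    return vector, bifurcations, convergences
```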
[0125] Step S403: Map the dynamic confidence vector to the first risk screening result layer by layer to form a dual-track comparison framework of semantic matching channel and feature evolution channel.
[0126] The specific steps of step S403 are as follows:
[0127] Step S4031: Expand the dynamic confidence vector according to the time hierarchy to generate a time index mapping table.
[0128] In this embodiment, each element in the dynamic confidence vector is deconstructed, with each element corresponding to the feature word risk expression intensity under a specific time window. Subsequently, using the time window as the basic unit, these elements are arranged sequentially, and a sequential index is established between different levels to reflect the evolutionary relationship from near to far and from local to global. During the time-level expansion process, the confidence of different time segments needs to be indexed and marked. Specifically, in the index table of each level, not only is the confidence value corresponding to the time period recorded, but the connection relationship between the current and previous windows is also marked, including increasing trends, decreasing trends, and identifiers for semantic bifurcation or convergence. Through the above processing, a time index mapping table is finally generated.
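The time index mapping table can be sketched as a list of records, one per time window, each carrying the confidence value and its relation to the previous window. The field names and the `"start"`/`"flat"` labels are assumptions for this example; bifurcation and convergence identifiers are left out for brevity.

```python
def time_index_table(confidence):
    """Expand a confidence vector into per-window index records."""
    table = []
    for i, value in enumerate(confidence):
        if i == 0:
            trend = "start"  # no previous window to compare against
        elif value > confidence[i - 1]:
            trend = "increasing"
        elif value < confidence[i - 1]:
            trend = "decreasing"
        else:
            trend = "flat"
        table.append({"window": i, "confidence": value, "trend": trend})
    return table
```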
[0129] Step S4032: Decompose the semantic representation of the feature words in the first risk screening result into layers, and generate a semantic index mapping table with semantic depth as the dimension.
[0130] In this embodiment, a semantic parsing method is used to divide the semantic representation of each feature word into different layers, including a basic semantic layer, a context-dependent layer, and a cross-segment association layer. Subsequently, during the decomposition process, the semantic elements of each layer are extracted and mapped to different dimensions, so that the same feature word is represented at multiple semantic depths. After completing the hierarchical decomposition, a semantic index mapping table needs to be generated based on the semantic depth. Specifically, the semantic units corresponding to the basic semantic layer are placed in the initial layer of the index table, the context-dependent layer is placed in the middle layer, and the cross-segment association layer is placed in the deep layer, and an index mapping relationship is established between the layers. Each semantic unit not only records its semantic content in the table, but also marks the connection path between it and other layer units, and finally obtains the semantic index mapping table.
[0131] Step S4033: Cross-couple the time index mapping table with the semantic index mapping table. During the coupling process, match the corresponding semantic level for each time segment to form an initial mapping matrix.
[0132] In this embodiment, the sequence of each time segment in the time index mapping table is parsed to clarify the position of each time node in the overall evolutionary trajectory. At the same time, the semantic index mapping table is expanded in layers to extract multi-level semantic elements such as the basic semantic layer, the context dependency layer, and the cross-segment association layer. Subsequently, in the coupling process, with the time segment as the main axis, each time node is paired with its corresponding semantic level to ensure that the depth distribution of semantics can be mapped to the continuous time evolution sequence. After completing the one-to-one pairing of time and semantics, a unified data matrix representation is established in the cross-coupling. Specifically, the time segments are marked in the row direction of the matrix, and the semantic levels are marked in the column direction. Each pairing result is mapped to a cell in the matrix, and weights or index values are filled according to the correspondence between the semantic layer and the time segment to finally form the initial mapping matrix.
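The row/column layout of the initial mapping matrix can be sketched as follows. The three level names follow the layers named in the text; the cell value (window confidence multiplied by the semantic strength of that level in that window) is an assumption, since the embodiment only says that weights or index values are filled in.

```python
LEVELS = ("basic", "context", "cross_segment")  # column order of the matrix


def initial_mapping_matrix(time_confidence, semantic_strength):
    """Rows are time segments, columns are semantic levels.

    time_confidence: [confidence per time window]
    semantic_strength: {level: [strength per time window]}
    """
    return [
        [conf * semantic_strength[level][t] for level in LEVELS]
        for t, conf in enumerate(time_confidence)
    ]
```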
[0133] Step S4034: Based on the initial mapping matrix, distinguish between the semantic matching channel and the feature evolution channel, and establish a bidirectional interactive link between the two to construct a dual-track comparison framework.
[0134] In this embodiment, semantic hierarchical indexes are aggregated in the matrix to extract row and column sub-blocks related to semantic logic, which are then marked as semantic matching channels. Simultaneously, the change patterns of time segments in the evolutionary chain are extracted, identifying weight mutations, bifurcation, and convergence regions, which are then marked as feature evolution channels. Through this division, the initial mapping matrix is decomposed into two complementary analysis channels: one focusing on semantic depth comparison, and the other on time series evolution. After channel separation, a bidirectional interactive link is established between the semantic matching channel and the feature evolution channel. Specifically, nodes with common feature word indices or overlapping time windows are selected in the two channels, and these nodes are used as anchors to establish an interactive relationship. Subsequently, bidirectional indexes are added to these anchors in the matrix structure, enabling the semantic channel to retrieve the corresponding time evolution path, and the time channel to trace back to the relevant semantic level. Through this bidirectional interactive mechanism, semantic and temporal information form a cyclical verification relationship in the comparison framework, ultimately constructing a dual-track comparison framework.
[0135] Step S404: Within the dual-track comparison framework, iteratively filter the set of nodes that maintain cross-track consistency, and remove segments that do not meet the consistency requirement to obtain a set of consistent nodes.
[0136] The specific steps of step S404 are as follows:
[0137] Step S4041: In the dual-track comparison framework, mark candidate nodes in the semantic matching channel and the feature evolution channel respectively, and assign a cross-track index to each candidate node.
[0138] In this embodiment, in the semantic matching channel, based on the hierarchical results of the semantic index mapping table, feature words or semantic units that exhibit high relevance across multiple semantic levels are selected as candidate nodes. In the feature evolution channel, nodes with significant evolutionary characteristics in temporal changes are extracted based on the bifurcation and convergence points marked in the time index mapping table. In this way, independent candidate node sets are formed within both channels, reflecting key units at the semantic logic level and key segments at the time evolution level, respectively. In the candidate node set, indexing rules are established according to the unique identifier of the feature word and the position of the time segment, generating index codes of the same format for nodes in both the semantic and time channels. At the same time, the index code is bound to the original context position of the node to ensure one-to-one retrieval during cross-track comparison, ultimately forming a cross-track indexing system.
[0139] Step S4042: Based on the cross-track index, detect the coupling relationship between nodes in the two channels, and construct cross-track consistency candidate pairs according to the coupling strength.
[0140] In this embodiment, the semantic hierarchy information and contextual position of each candidate node are extracted in the semantic channel; the temporal segment position and evolutionary trajectory features of nodes under the same index are extracted in the feature evolution channel; then, using the cross-track index as a reference standard, semantic nodes and temporal nodes are paired, and the semantic consistency and temporal continuity of the paired nodes are tested in two dimensions; if the nodes maintain a high degree of overlap in semantic representation and have a corresponding relationship in the temporal evolution path, it is determined that there is a potential coupling relationship between the two; for each pair of nodes, its comprehensive coupling score in the two dimensions of semantic overlap and evolutionary continuity is calculated, and the candidate node pairs are sorted according to the score. Subsequently, node pairs with scores exceeding a preset threshold are marked as cross-track consistency candidate pairs, and corresponding connection structures are established within the comparison framework to obtain cross-track consistency candidate pairs.
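A minimal sketch of the pairing and thresholding is given below. It assumes each channel exposes one score per cross-track index and combines the two with a linear weight `alpha`; the linear form, `alpha`, and the default threshold are assumptions, as the embodiment only states that a comprehensive coupling score is compared with a preset threshold.

```python
def consistency_pairs(semantic_nodes, evolution_nodes, threshold=0.6, alpha=0.5):
    """Pair nodes sharing a cross-track index and keep the strongly coupled ones.

    semantic_nodes: {cross_track_index: semantic overlap in [0, 1]}
    evolution_nodes: {cross_track_index: evolutionary continuity in [0, 1]}
    """
    pairs = []
    for key in semantic_nodes.keys() & evolution_nodes.keys():
        score = alpha * semantic_nodes[key] + (1 - alpha) * evolution_nodes[key]
        if score > threshold:
            pairs.append((key, score))
    # Highest coupling score first; ties broken by index for determinism.
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))
```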
[0141] Step S4043: Perform multiple rounds of iterative screening on the cross-track consistency candidate pairs. In each round of screening, remove node pairs that fail to maintain contextual coherence or evolutionary continuity, and retain the set of nodes that form a stable link.
[0142] In this embodiment, the context between each pair of nodes in the candidate pair set is compared to examine their consistency in semantic continuity and temporal coherence. If a node pair has a significant break in semantic logic or exhibits a jump in the time sequence, it is directly eliminated in the initial screening. Subsequently, based on the retained node pairs, a temporal chain of node pairs is established according to the extension relationship of adjacent windows in the evolutionary trajectory. The evolution of the node pairs is then checked to see if it is smooth or gradually continuous. If a node pair undergoes a sudden change or loses continuity during the evolution process, it is excluded in the next screening. The logic of multi-round iterative screening is to gradually converge to a stable set of nodes. Each round of screening performs finer-grained corrections based on the results of the previous round, starting with macro-level semantic coherence and then moving to local evolutionary consistency, tightening the screening criteria layer by layer. After several rounds of iteration, a stable set of links is finally formed.
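The layer-by-layer tightening can be sketched as a sequence of filter rounds. The round schedule used as the default (semantic coherence first, then evolutionary continuity with a stricter threshold) and all numeric thresholds are assumptions for illustration.

```python
def iterative_screen(pairs, rounds=(("semantic", 0.5),
                                    ("evolution", 0.5),
                                    ("evolution", 0.7))):
    """Apply successive screening rounds and return the surviving pairs.

    pairs: {pair_id: (semantic_coherence, evolutionary_continuity)}
    rounds: sequence of (criterion, minimum score), applied in order.
    """
    axis = {"semantic": 0, "evolution": 1}
    kept = dict(pairs)
    for criterion, threshold in rounds:
        # Each round works on the survivors of the previous round.
        kept = {
            pair_id: scores for pair_id, scores in kept.items()
            if scores[axis[criterion]] >= threshold
        }
    return kept
```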
[0143] Step S4044: Further aggregate the set of nodes of the stable link into a set of cross-track consistent nodes.
[0144] In this embodiment, nodes in the stable link are categorized according to their respective channels: one category originates from the semantic matching channel, and the other from the feature evolution channel. Subsequently, in the categorized node set, nodes are compared based on cross-track indexes to detect overlaps in semantic location, time window, and feature weight distribution. If multiple nodes exhibit high consistency or high coupling in the above dimensions, they are grouped into a cross-track corresponding group to achieve one-to-one or one-to-many aggregation between nodes. After completing the initial grouping, the cross-track corresponding groups need to be merged and optimized. Specifically, within the same group, the connectivity between nodes is calculated, and redundant nodes are removed based on the connectivity strength, retaining only the nodes that play a core role in the cross-track comparison. If there are high-frequency co-occurrences or logical dependencies between different groups, they are further merged into a higher-level cross-track node set, ultimately forming a cross-track consistent node set.
[0145] Step S405: Output the final public opinion screening and matching results based on the set of consistent nodes.
[0146] In this embodiment, the consistent node set is structurally expanded, and each node in the set is reordered according to the time dimension and semantic level to ensure that its logical order reflects the complete path of public opinion evolution. Subsequently, based on the sorting, the interaction relationship between nodes is analyzed to identify their dual orientation in the semantic channel and feature channel, and these orientations are mapped as candidate identifiers of risk topics. Within each risk topic, the feature weights and semantic strengths of all relevant nodes are summarized, and a unified risk label is generated through weighted integration. At the same time, nodes that appear repeatedly in different tracks are given priority to be included in the final output. After the above processing, the output public opinion screening and matching results reflect the dual constraints of feature word evolution and semantic logic, realizing the systematization and accuracy of risk identification as a whole.
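The weighted integration into a unified risk label can be illustrated as a feature-weighted mean of semantic strengths, mapped to a label by fixed cut-offs. The cut-offs (0.7 and 0.4), the label names, and the function `risk_label` are hypothetical and chosen only for this sketch.

```python
def risk_label(nodes):
    """Combine node scores within one risk topic into a score and a label.

    nodes: [(feature_weight, semantic_strength), ...]
    """
    total = sum(weight for weight, _ in nodes)
    # Weighted mean of semantic strength, using feature weights as weights.
    score = sum(w * s for w, s in nodes) / total if total else 0.0
    if score >= 0.7:
        level = "high"
    elif score >= 0.4:
        level = "medium"
    else:
        level = "low"
    return score, level
```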
[0147] Example 2
[0148] Referring to Figure 4, another embodiment of the present invention provides a dual-risk public opinion screening and matching system, comprising: a data acquisition module, a weight matrix construction module, a first result output module, and a final result output module;
[0149] The data acquisition module is used to collect public opinion data and construct a public opinion corpus set, which is obtained by performing time-series annotation and association link construction on the collected public opinion data;
[0150] The weight matrix construction module is used to dynamically generate a set of risk feature words based on the set of public opinion corpus, and construct a risk weight matrix of risk feature words according to the intensity of event evolution;
[0151] The first result output module is used to input the public opinion corpus into a pre-trained language model, and use the feature word risk weight matrix as a mandatory constraint to perform cross-level fusion matching of global semantic representation and local feature weights to obtain the first risk screening result.
[0152] The final result output module is used to perform secondary verification on the first risk screening result, calculate the dynamic confidence score based on the evolution trajectory of risk feature words, and perform dual-track cross-correction with the first risk screening result to obtain the final public opinion screening matching output.
[0153] In addition, the parts of the technical solutions provided in the embodiments of this application that are consistent with the implementation principles of the corresponding technical solutions in the prior art have not been described in detail, so as to avoid excessive elaboration.
[0154] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above descriptions are merely specific embodiments of the present invention and are not intended to limit the invention. Any modifications, equivalent substitutions, or improvements made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A dual risk public opinion screening matching method, characterized in that it comprises: collecting public opinion data and constructing a public opinion corpus set, which is obtained by performing time-series annotation and constructing correlation links on the collected public opinion data; dynamically generating a set of risk feature words based on the public opinion corpus, and constructing a risk weight matrix of the risk feature words according to the intensity of event evolution; inputting the public opinion corpus into a pre-trained language model and, using the feature word risk weight matrix as a mandatory constraint, performing cross-level fusion matching of the global semantic representation and the local feature weights to obtain a first risk screening result; and verifying the first risk screening result a second time, calculating a dynamic confidence score based on the evolution trajectory of the risk feature words, and cross-correcting it with the first risk screening result to obtain the final public opinion screening matching output.
2. The dual risk public opinion screening matching method according to claim 1, characterized in that the construction of the public opinion corpus includes: performing timestamp correction on the collected public opinion data; generating event indication fragments based on the implicit correlations in the corrected public opinion data, and aggregating event indication fragments belonging to the same event clue into an initial event set; introducing a semantic connection index into the initial event set to connect adjacent event fragments in context and generate event semantic paths containing causal chains; and sorting the event semantic paths according to their chronological order and semantic coherence to obtain the sorted event semantic paths.
3. The dual risk public opinion screening matching method according to claim 1, characterized in that dynamically generating a set of risk feature words based on the public opinion corpus, and constructing a risk weight matrix of the risk feature words according to the intensity of event evolution, includes: automatically extracting a set of semantic fragments from the sorted event semantic paths, and performing hierarchical clustering within different time windows to generate multi-level semantic clusters; within the multi-level semantic clusters, gradually selecting candidate feature word sets by iteratively comparing semantic similarity and semantic salience, and mapping them to the corresponding event semantic paths; based on the candidate feature word sets, constructing a feature word evolution chain across time periods, and dynamically adjusting the basic weights of the feature words according to the extension strength and bifurcation frequency of the evolution chain to obtain dynamic basic weights; and superimposing the dynamic basic weights with the cross-event coupling parameters to form a risk feature word weight matrix.
4. The dual risk public opinion screening matching method according to claim 3, characterized in that superimposing the dynamic basic weights with the cross-event coupling parameters to form a risk feature word weight matrix includes: expanding the dynamic basic weights of each candidate feature word in the evolution chain according to time windows, and constructing a multi-dimensional time weight vector based on the transition rules between windows; introducing a cross-event coupling parameter into the multi-dimensional time weight vector, the cross-event coupling parameter being obtained by detecting the semantic overlap and divergence positions between different event paths; combining the time weight vector with the cross-event coupling parameter layer by layer to form an intermediate weight network with a hierarchical structure, and generating a cross-event cross-index within the intermediate weight network; and generating, based on the cross-index of the intermediate weight network, a risk feature word weight matrix that carries information from both the time evolution dimension and the event coupling dimension.
5. The dual-risk public opinion screening and matching method according to claim 1, characterized in that inputting the public opinion corpus set into a pre-trained language model and, using the feature word risk weight matrix as a mandatory constraint, performing cross-level fusion matching of global semantic representation and local feature weights to obtain the first risk screening result, comprises: inputting the public opinion corpus set into the pre-trained language model to generate a global semantic representation covering the entire text, and adding a context position index to the global semantic representation; mapping the risk feature word weight matrix to a constraint vector field, and superimposing the constraint vector field on the context position index of the global semantic representation to form a semantically constrained corpus representation; in the semantically constrained corpus representation, selecting local segments and performing targeted aggregation by applying the risk feature word weights to them, so that the local semantic segments and the risk feature words form a one-to-one correspondence; aligning the global semantic representation across levels with the aggregated results of the local feature weights, and generating a fused semantic matrix through multi-level iterative comparison; and extracting the first risk screening result based on the fused semantic matrix.
6. The dual-risk public opinion screening and matching method according to claim 5, characterized in that extracting the first risk screening result based on the fused semantic matrix comprises: in the fused semantic matrix, generating preliminary risk index lists along the semantic dimension and the feature word dimension respectively to form a candidate risk segment set; subjecting the candidate risk segment set to multi-level cross-comparison, and establishing mapping links between segments based on the continuity of semantic context and the concentration of feature word weights; transforming the mapping links into a hierarchical risk structure graph, and automatically identifying in the structure graph the set of risk nodes with the highest semantic cross-layer connectivity; and generating the first risk screening result based on the set of risk nodes.
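The node selection step of claim 6 can be illustrated with a small sketch. It assumes, purely for illustration, that each candidate segment has already been assigned to a layer of the hierarchical risk structure graph, that the mapping links are given as undirected pairs, and that "semantic cross-layer connectivity" is measured as the number of links joining a node to nodes in a different layer. The function name `top_risk_nodes` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def top_risk_nodes(layers: Dict[str, int],
                   links: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return the nodes with the highest count of cross-layer links.

    layers: segment id -> layer index in the structure graph (assumed given)
    links:  undirected mapping links between segment ids
    """
    cross = defaultdict(int)
    for a, b in links:
        # Only links that connect different layers count as cross-layer.
        if layers[a] != layers[b]:
            cross[a] += 1
            cross[b] += 1
    if not cross:
        return set()
    best = max(cross.values())
    return {node for node, count in cross.items() if count == best}
```

Ties are kept, so the returned set can contain several risk nodes when they share the same maximum connectivity.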
7. The dual-risk public opinion screening and matching method according to claim 1, characterized in that performing secondary verification on the first risk screening result, calculating a dynamic confidence score based on the evolution trajectory of the risk feature words, and performing dual-track cross-calibration with the first risk screening result to obtain the final public opinion screening and matching output, comprises: extracting the corresponding feature word set from the first risk screening result, and retrieving multiple historical segments of the feature word set in the evolution trajectory along the time dimension; based on the historical segments, constructing a temporal change curve of the feature words, and marking semantic bifurcation points and semantic convergence points in the temporal change curve to generate a dynamic confidence vector; mapping the dynamic confidence vector layer by layer with the first risk screening result to form a dual-track comparison framework comprising a semantic matching channel and a feature evolution channel; within the dual-track comparison framework, iteratively filtering out segments that do not meet the consistency requirement to obtain a set of nodes that maintain cross-track consistency; and outputting the final public opinion screening and matching result based on the set of consistent nodes.
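One possible reading of the dynamic confidence vector in claim 7 is sketched below. The claim does not say how the marked points affect confidence, so the following are assumptions: the temporal change curve is represented as the number of distinct semantic branches of a feature word in each time window, a rise in that number marks a bifurcation point, a fall marks a convergence point, and confidence moves down or up by a fixed `step` and is clamped to the range 0 to 1.

```python
from typing import List


def mark_points(branch_counts: List[int]) -> List[str]:
    """Label each transition between consecutive time windows."""
    marks = []
    for prev, cur in zip(branch_counts, branch_counts[1:]):
        if cur > prev:
            marks.append("bifurcation")
        elif cur < prev:
            marks.append("convergence")
        else:
            marks.append("stable")
    return marks


def dynamic_confidence(branch_counts: List[int],
                       step: float = 0.25) -> List[float]:
    """Build one confidence value per time window.

    Assumed rule: bifurcation lowers confidence by `step`,
    convergence raises it by `step`, values stay within [0, 1].
    """
    if not branch_counts:
        return []
    conf = 1.0
    vector = [conf]
    for mark in mark_points(branch_counts):
        if mark == "bifurcation":
            conf -= step
        elif mark == "convergence":
            conf += step
        conf = min(1.0, max(0.0, conf))
        vector.append(conf)
    return vector
```

With branch counts `[1, 2, 3, 2, 2]`, the transitions are two bifurcations, one convergence and one stable step, giving the vector `[1.0, 0.75, 0.5, 0.75, 0.75]`.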
8. The dual-risk public opinion screening and matching method according to claim 7, characterized in that mapping the dynamic confidence vector layer by layer with the first risk screening result to form the dual-track comparison framework comprising the semantic matching channel and the feature evolution channel comprises: expanding the dynamic confidence vector according to the time hierarchy to generate a time index mapping table; decomposing the semantic representation of the feature words in the first risk screening result into layers, and generating a semantic index mapping table with semantic depth as the dimension; cross-coupling the time index mapping table and the semantic index mapping table, and, during the coupling process, matching the corresponding semantic level to each time segment to form an initial mapping matrix; and, based on the initial mapping matrix, distinguishing the semantic matching channel from the feature evolution channel and establishing a bidirectional interactive link between the two to construct the dual-track comparison framework.
9. The dual-risk public opinion screening and matching method according to claim 7, characterized in that, within the dual-track comparison framework, iteratively filtering out segments that do not meet the consistency requirement to obtain the set of nodes that maintain cross-track consistency comprises: in the dual-track comparison framework, labeling candidate nodes in the semantic matching channel and the feature evolution channel respectively, and assigning a cross-track index to each candidate node; based on the cross-track index, detecting the coupling relationship between nodes in the two channels, and constructing cross-track consistency candidate pairs according to the coupling strength; subjecting the consistency candidate pairs to multiple rounds of iterative screening, wherein in each round node pairs that fail to maintain contextual coherence or evolutionary continuity are eliminated and the set of nodes that form a stable link is retained; and further aggregating the set of nodes in the stable link into the set of cross-track consistent nodes.
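The multi-round screening of claim 9 can be sketched as a fixed-point loop. This is a simplified illustration under several assumptions that are not in the claims: contextual coherence and evolutionary continuity are given as scores between 0 and 1, a single `threshold` applies to both, and a "stable link" is approximated as a pair that shares at least one node with another pair still in the candidate set. The `Pair` type and `consistent_nodes` function are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Pair(NamedTuple):
    semantic_node: str    # node in the semantic matching channel
    evolution_node: str   # node in the feature evolution channel
    coherence: float      # contextual coherence score (assumed 0..1)
    continuity: float     # evolutionary continuity score (assumed 0..1)


def consistent_nodes(pairs: Iterable[Pair],
                     threshold: float = 0.5,
                     max_rounds: int = 10) -> Set[str]:
    """Iteratively drop weak or isolated pairs, then aggregate the nodes."""
    surviving = list(pairs)
    for _ in range(max_rounds):
        sem = Counter(p.semantic_node for p in surviving)
        evo = Counter(p.evolution_node for p in surviving)
        kept = [
            p for p in surviving
            if p.coherence >= threshold
            and p.continuity >= threshold
            # Assumed stability rule: the pair shares a node with another pair.
            and (sem[p.semantic_node] > 1 or evo[p.evolution_node] > 1)
        ]
        if len(kept) == len(surviving):
            break  # nothing was eliminated in this round
        surviving = kept
    return ({p.semantic_node for p in surviving}
            | {p.evolution_node for p in surviving})
```

Because link counts are taken at the start of each round, a pair whose only neighbour is eliminated for low scores is itself removed in the following round, which is why more than one round can be needed.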
10. A dual-risk public opinion screening and matching system for implementing the dual-risk public opinion screening and matching method according to any one of claims 1-9, characterized in that the system comprises: a data acquisition module, a weight matrix construction module, a first result output module, and a final result output module; the data acquisition module is used to collect public opinion data and construct a public opinion corpus set, which is obtained by performing time-series annotation and association link construction on the collected public opinion data; the weight matrix construction module is used to dynamically generate a set of risk feature words based on the public opinion corpus set, and to construct a risk weight matrix for the risk feature words according to the intensity of event evolution; the first result output module is used to input the public opinion corpus set into a pre-trained language model and, using the feature word risk weight matrix as a mandatory constraint, to perform cross-level fusion matching of global semantic representation and local feature weights to obtain the first risk screening result; and the final result output module is used to perform secondary verification on the first risk screening result, calculate a dynamic confidence score based on the evolution trajectory of the risk feature words, and perform dual-track cross-calibration with the first risk screening result to obtain the final public opinion screening and matching output.