A student community abnormal behavior recognition method based on big data
By constructing a dynamic spatiotemporal hypergraph sequence and performing non-negative Tucker decomposition, semantic impedance and structural entropy are calculated to generate a comprehensive anomaly score. This solves the problems of insufficient accuracy in identifying student community behavior and insufficient identification of hidden risks in existing technologies, and realizes refined security management.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANDONG POLYTECHNIC COLLEGE
- Filing Date
- 2026-03-03
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies struggle to distinguish high-frequency normal behavior from high-frequency illegal behavior and do not account for the topological characteristics of student community social networks. As a result, hidden risks are insufficiently identified, and there is no differentiated mechanism for judging abnormal behavior, which makes refined responses difficult.
By constructing a dynamic spatiotemporal hypergraph sequence, performing non-negative Tucker decomposition, calculating semantic impedance values and structural entropy, and combining behavioral intensity vectors and node degree distributions, a comprehensive anomaly score is generated to distinguish between explicit violations and implicit structural anomalies.
It enables refined identification of abnormal student behavior, allowing for early detection of potential hidden risks, providing differentiated management interventions, and improving the accuracy and response efficiency of safety management.
[Figure CN122286321A_ABST: abstract drawing]
Description
Technical Field
[0001] This invention relates to the field of big data analysis and public safety monitoring technology, and specifically to a method for identifying abnormal behavior in student communities based on big data.
Background Technology
[0002] With the advancement of smart campus construction, students' various activities on campus generate massive amounts of multi-source heterogeneous data, including consumption records, access control data, library borrowing records, and online logs. Analyzing and monitoring student behavior using this data is crucial for timely detection of potential safety hazards, addressing student mental health concerns, and maintaining campus order.
[0003] Most existing student abnormal behavior detection technologies rely on rule-based statistical models or simple threshold judgments. These methods mainly focus on the frequency of behavior or the numerical intensity of a single dimension, lacking a deep understanding of the semantics of the behavioral context. For example, judging solely based on card swipe frequency or time spent in a location often fails to effectively distinguish between normal active behavior (such as frequent library study or club activities) and deviant, irregular active behavior (such as abnormal gatherings or loitering in prohibited areas). This makes it difficult for the system to accurately interpret the compliance behind the behavior, easily resulting in a high false alarm rate.
[0004] Furthermore, traditional methods typically focus on monitoring overt individual violations, often responding only after abnormal behavior has occurred and reached an alarm threshold, exhibiting a significant lag. These methods neglect the topological characteristics of student communities as complex interactive networks, making it difficult to perceive latent risks that have not yet manifested drastic overt behavior but occupy critical and vulnerable positions within the social network structure or show a tendency towards structural isolation. This results in insufficient early warning capabilities for potential psychological crises or group structural risks.
[0005] Meanwhile, existing technologies often employ a single-dimensional comprehensive scoring mechanism when determining anomalies, lacking a refined classification of the source and nature of the anomalies. This approach cannot distinguish between explicit anomalies caused by specific violations and implicit structural anomalies resulting from changes in social relationships. Consequently, it becomes difficult for managers to take targeted and differentiated intervention measures based on the specific nature of the risks, failing to meet the actual needs of campus safety management for refined and tiered responses.
Summary of the Invention
[0006] To address the shortcomings of existing technologies, this invention provides a method for identifying abnormal behavior in student communities based on big data. The method addresses three problems of existing technologies: reliance on single-frequency statistics, which cannot distinguish high-frequency normal behavior from high-frequency illegal behavior; reliance on explicit records, which cannot identify implicit structural risks in the network topology in advance; and the lack of a differentiated judgment mechanism, which prevents a refined hierarchical response to abnormal behaviors.
[0007] To achieve the above objectives, the present invention provides a method for identifying abnormal behavior in student communities based on big data, comprising the following steps: slicing multi-source heterogeneous data according to a preset time window, and constructing a dynamic spatiotemporal hypergraph sequence containing nodes and hyperedges based on entity co-occurrence relationships; mapping the dynamic spatiotemporal hypergraph sequence to a higher-order tensor, performing non-negative Tucker decomposition to extract the core tensor, and determining the semantic impedance value based on the semantic matching degree between each hyperedge and the core tensor; constructing the behavior intensity vector of the student node to be detected, calculating the instantaneous energy dissipation value in combination with the semantic impedance value, and standardizing the instantaneous energy dissipation value using the group baseline index to obtain the relative dissipation index; calculating the global structural entropy based on the node degree distribution, constructing a perturbation hypergraph replica by virtually removing the student node to be detected, and calculating the change in the global structural entropy before and after the removal to generate a structural sensitivity index; normalizing and weighting the relative dissipation index and the structural sensitivity index to obtain a comprehensive anomaly score; and determining explicit violation anomalies or implicit structural anomalies based on the comprehensive anomaly score and the sub-indicators.
[0008] Preferably, the construction of the dynamic spatiotemporal hypergraph sequence containing nodes and hyperedges specifically includes: establishing a global node index table, and defining the node set as consisting of a student entity node set, a physical space node set, and an event attribute node set; for each time slice, extracting all interaction records within the time slice and, when multiple students appear at the same location or participate in the same activity, combining the associated student nodes, location nodes, and attribute nodes to generate a hyperedge; constructing an association matrix whose element values represent the inclusion relationship between hyperedges and nodes, where an element is assigned 1 if the node is included in the hyperedge and 0 otherwise; and obtaining the dynamic spatiotemporal hypergraph sequence from the node set, the generated hyperedges, and the constructed association matrix of each time slice.
[0009] Preferably, determining the semantic impedance value based on the semantic matching degree between the hyperedge and the core tensor specifically includes: mapping the dynamic spatiotemporal hypergraph sequence to a third-order tensor whose elements represent the co-occurrence association strength between nodes; decomposing the third-order tensor into the mode product of the core tensor, the node factor matrix, and the time factor matrix; extracting the factor vectors of the nodes contained in the currently observed hyperedge and the time factor vector at the current moment, and calculating the degree of matching between these factor vectors and the core tensor to obtain the semantic consistency score; and obtaining the semantic impedance value of the hyperedge as an inverse proportional function of the semantic consistency score, so that a higher consistency score yields a lower impedance.
[0010] Preferably, the calculation of the instantaneous energy dissipation value in conjunction with the semantic impedance value specifically includes: Iterate through all event records of the student node to be tested, and assign values to the behavior intensity vector based on the duration of participation or the frequency of interaction. Calculate the product of the square of each element in the behavior intensity vector and the semantic impedance value of the corresponding hyperedge of the element, and sum all the products to obtain the instantaneous energy dissipation value.
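The dissipation calculation described above can be illustrated with a minimal Python sketch. The function name `instantaneous_dissipation` and the list-based representation of the behavior intensity vector and hyperedge impedances are assumptions made for illustration only; the source does not prescribe a data layout.

```python
def instantaneous_dissipation(intensity, impedance):
    """Sum of (intensity element squared) x (impedance of the matching hyperedge).

    intensity: behavior intensity vector, one element per hyperedge
               (zero where the student did not participate).
    impedance: semantic impedance value of each hyperedge, same order.
    Both arguments are hypothetical plain lists of floats.
    """
    if len(intensity) != len(impedance):
        raise ValueError("intensity and impedance must have the same length")
    return sum(x * x * z for x, z in zip(intensity, impedance))


# A student active on hyperedges 0 and 2 only.
value = instantaneous_dissipation([2.0, 0.0, 1.0], [0.5, 10.0, 3.0])
```

Because the intensity enters squared, an intense activity on a low-impedance (regular) hyperedge can still contribute less than a brief activity on a high-impedance (irregular) one.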
[0011] Preferably, the standardization of the instantaneous energy dissipation value using the group baseline index specifically includes: calculating the group mean and standard deviation of the instantaneous energy dissipation values of all active student nodes within the current time slice as the group baseline index; and subtracting the group mean from the instantaneous energy dissipation value of the student node to be detected and dividing the result by the standard deviation to obtain the relative dissipation index.
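The standardization step is a z-score over the active students of one time slice. The sketch below assumes the population standard deviation (the source does not say whether the population or sample form is intended) and returns zero when all students dissipate equally; the function name and the dictionary input are hypothetical.

```python
from statistics import mean, pstdev


def relative_dissipation(dissipation_by_student, target):
    """Z-score of the target student's dissipation within the current time slice.

    dissipation_by_student: hypothetical dict mapping student id to
    instantaneous energy dissipation value for all active students.
    """
    values = list(dissipation_by_student.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return 0.0  # assumption: no spread means no relative deviation
    return (dissipation_by_student[target] - mu) / sigma


scores = {"a": 1.0, "b": 2.0, "c": 3.0}
index_c = relative_dissipation(scores, "c")
```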
[0012] Preferably, the calculation of global structural entropy based on node degree distribution specifically includes: calculating the sum of the degrees of all nodes in the current hypergraph network, and dividing the degree of each node by this sum to obtain the normalized stationary distribution probability of the node; and taking the negative of the sum, over all nodes, of each stationary distribution probability multiplied by its logarithm to obtain the global structural entropy of the current hypergraph system.
[0013] Preferably, the calculation of the change in global structural entropy before and after removal to generate a structural sensitivity index specifically includes: Construct a virtual perturbation hypergraph replica, and delete the student node to be detected and all hyperedge connections in which the student node to be detected participates in the perturbation hypergraph replica; The stationary distribution probability of the remaining nodes and the perturbed global structural entropy are recalculated based on the perturbed hypergraph replica; The absolute value of the difference between the global structural entropy before and after the perturbation is calculated and used as the structural sensitivity index.
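The entropy and perturbation steps can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes the natural logarithm, represents each hyperedge as a Python set of node names, and interprets "virtual removal" as deleting the student's membership from every hyperedge while keeping the other members. All function names are hypothetical.

```python
import math


def structural_entropy(hyperedges):
    """Shannon entropy of the degree-normalized node distribution."""
    degree = {}
    for edge in hyperedges:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    total = sum(degree.values())
    if total == 0:
        return 0.0
    # p_i = d_i / sum(d); entropy = -sum(p_i * log p_i), natural log assumed
    return -sum((d / total) * math.log(d / total) for d in degree.values())


def structural_sensitivity(hyperedges, student):
    """Absolute entropy change after virtually removing one student node."""
    before = structural_entropy(hyperedges)
    # Perturbed replica: the original hyperedge list is left untouched.
    replica = [set(edge) - {student} for edge in hyperedges]
    replica = [edge for edge in replica if edge]
    after = structural_entropy(replica)
    return abs(before - after)


edges = [{"s1", "s2", "lib"}, {"s1", "s3", "gym"}]
sensitivity_s1 = structural_sensitivity(edges, "s1")
```

In this small example `s1` is the only node shared by both hyperedges, so removing it flattens the degree distribution and changes the entropy.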
[0014] Preferably, the normalization and weighted fusion of the relative dissipation index and the structural sensitivity index specifically includes: applying max-min normalization to the relative dissipation index and the structural sensitivity index, respectively; setting a basic noise threshold, and including the structural sensitivity index in the weighted calculation only when the normalized structural sensitivity index exceeds the basic noise threshold; and weighting and summing the normalized indicators using the weight coefficients of explicit behavioral features and implicit structural features to obtain the comprehensive anomaly score.
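A minimal sketch of this fusion step follows. The weight values (0.6 and 0.4) and the noise threshold (0.1) are placeholder assumptions, since the source gives no numerical values, and the function names are hypothetical.

```python
def min_max(values):
    """Max-min normalization to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(dissipation, sensitivity,
                     w_explicit=0.6, w_implicit=0.4, noise_threshold=0.1):
    """Weighted fusion of the two normalized indices, one score per student.

    The structural term is included only when its normalized value exceeds
    the noise threshold. Weights and threshold are illustrative assumptions.
    """
    d_norm = min_max(dissipation)
    s_norm = min_max(sensitivity)
    return [
        w_explicit * d + (w_implicit * s if s > noise_threshold else 0.0)
        for d, s in zip(d_norm, s_norm)
    ]


scores = composite_scores([0.0, 1.0, 2.0], [0.0, 0.05, 1.0])
```

Here the second student's structural term (0.05 after normalization) falls below the threshold and is ignored, so only the behavioral term contributes to that score.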
[0015] Preferably, the determination of explicit violations specifically includes: Determine whether the relative dissipation index is greater than a preset explicit anomaly determination threshold; If the judgment result is true, an explicit violation anomaly marker is generated, and a corresponding behavioral intervention response instruction is generated.
[0016] Preferably, the determination of latent structural anomalies specifically includes: Determine whether the structural sensitivity index is greater than a preset latent anomaly detection threshold, and whether the relative dissipation index is within the normal range; If the judgment result is true, a latent structural anomaly marker is generated, and a corresponding counselor attention or investigation instruction is generated.
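The two determination rules can be combined into one small classifier, sketched below. The threshold values are placeholders, and treating "within the normal range" as "not above the explicit threshold" is an assumption; the source does not define the normal range numerically.

```python
def classify(relative_dissipation, structural_sensitivity,
             explicit_threshold=2.0, latent_threshold=0.5):
    """Return an anomaly label from the two sub-indicators.

    Thresholds are illustrative assumptions. The explicit check runs first,
    so the latent label is only reached when dissipation is in the normal range.
    """
    if relative_dissipation > explicit_threshold:
        return "explicit_violation"   # triggers a behavioral intervention response
    if structural_sensitivity > latent_threshold:
        return "latent_structural"    # triggers counselor attention or investigation
    return "normal"


label = classify(relative_dissipation=0.5, structural_sensitivity=0.9)
```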
[0017] This invention provides a method for identifying abnormal behavior in student communities based on big data. It has the following beneficial effects: 1. This invention constructs a dynamic spatiotemporal hypergraph and performs non-negative Tucker decomposition to extract the core tensor representing the regular pattern of a group, and then calculates the semantic impedance value reflecting the degree of behavioral deviation. This transforms simple behavior frequency statistics into energy dissipation calculation with semantic weights, which can effectively distinguish between high-frequency normal activities that conform to the group's rules and high-frequency illegal activities that deviate from the rules, thus improving the semantic accuracy of abnormal behavior identification.
[0018] 2. This invention utilizes counterfactual inference logic based on global structural entropy. By simulating the removal of nodes to be detected in the computational space and evaluating the change in system entropy, it generates a structural sensitivity index. It does not rely on explicit violation records, but focuses on mining the structural importance and potential vulnerability of nodes in the social topology. Thus, it can identify key nodes with hidden risks in advance, even before students show significant abnormal behavior.
[0019] 3. This invention integrates the relative dissipation index and structural sensitivity index, and introduces a basic noise threshold to filter non-critical disturbances, thus constructing a comprehensive anomaly scoring system. Based on the characteristics of the sub-indicators, it can accurately determine the two different types of risks: explicit violation anomalies and implicit structural anomalies. This provides managers with differentiated intervention or attention instructions, and realizes refined hierarchical response for student community safety management.
Attached Figure Description
[0020] Figure 1 is a schematic diagram of the hardware operating environment and data acquisition architecture of the present invention; Figure 2 is a schematic diagram of the overall method flow of the present invention; Figure 3 is a system structure block diagram of the present invention.
[0021] The components are as follows: 100. Computing and processing device; 101. Central processing unit; 102. Memory; 103. Network communication interface; 200. Data acquisition module; 201. Campus card consumption data interface; 202. Access control data interface; 203. Library management data interface; 204. Academic affairs attendance data interface; 300. Data preprocessing module; 400. Data storage module; 401. Relational database unit; 402. Hypergraph structure database unit; 403. Tensor data storage unit; 501. Hypergraph construction module; 502. Impedance calculation module; 503. Dissipation analysis module; 504. Sensitivity detection module; 505. Decision module.
Detailed Implementation
[0022] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0023] Referring to Figure 1 and Figure 2, the present invention provides a student community abnormal behavior identification system based on big data. The system is deployed on a computing processing device 100, which is connected to the data acquisition environment either physically or over a network.
[0024] The computing processing device 100, acting as the execution entity, includes a central processing unit 101, a memory 102, and a network communication interface 103 at the hardware level. The central processing unit 101 interacts with the memory 102 and the network communication interface 103 via an internal system bus. The memory 102 stores computer-readable instructions, which, when executed by the central processing unit 101, implement the abnormal behavior identification logic described in this invention. The network communication interface 103 is configured to establish data transmission channels with external sensor networks and terminal devices.
[0025] The system's data input terminal is connected to the data acquisition module 200. The data acquisition module 200 includes multiple sub-interfaces distributed throughout the campus physical environment: a campus card consumption data interface 201, used to acquire transaction time and location data of students at terminals such as the cafeteria and supermarket; an access control data interface 202, used to acquire turnstile access records of students entering and exiting dormitories, laboratories, and teaching buildings; a library management data interface 203, used to acquire borrowing records and reading room seat sign-in data; and an academic affairs attendance data interface 204, used to acquire course schedules and electronic attendance records for classes.
[0026] The data preprocessing module 300 is connected to the data acquisition module 200 and is configured to clean, align, and format multi-source heterogeneous data, remove noisy data, and convert unstructured records into a standard time-series event stream. The processed data is stored in the data storage module 400.
[0027] The data storage module 400 includes: a relational database unit 401 for storing basic student information and static attribute data including geographic location coordinates; a hypergraph structure database unit 402 for storing the generated dynamic hypergraph sequence and association matrix; and a tensor data storage unit 403 for storing core tensors and factor matrix data in the process of high-order tensor decomposition.
[0028] At the functional logic level, the system includes a hypergraph construction module 501, an impedance calculation module 502, a dissipation analysis module 503, a sensitivity detection module 504, and a decision module 505, which are connected sequentially or operate in parallel. These modules are constructed by program code stored in memory 102 and executed by the central processing unit 101.
[0029] The hypergraph construction module 501 is configured to receive time-series data streams from the data preprocessing module 300. The hypergraph construction module 501 executes hyperedge generation logic, mapping student entities, location entities, and attribute entities within the same time window to node sets according to preset time window parameters. The hypergraph construction module 501 identifies co-occurrence relationships between multiple entities, generates a hyperedge structure containing multiple nodes, and constructs the corresponding hypergraph association matrix. The hypergraph construction module 501 writes the generated dynamic hypergraph sequence data into the hypergraph structure database unit 402 and outputs it to the impedance calculation module 502 and the dissipation analysis module 503.
[0030] Impedance calculation module 502 is connected to hypergraph construction module 501 and configured to calculate the physical impedance properties of each connection path in the hypergraph network. Impedance calculation module 502 uses a tensor operation unit to convert the input dynamic hypergraph sequence into a high-order tensor structure and performs non-negative Tucker decomposition to extract core behavioral motifs. Impedance calculation module 502 calculates the semantic matching degree between the currently observed hyperedge and the core behavioral motif, and generates the semantic impedance value of each hyperedge based on the reciprocal of the matching degree. Impedance calculation module 502 transmits the calculated semantic impedance data to dissipation analysis module 503.
[0031] The dissipation analysis module 503 is connected to the hypergraph construction module 501 and the impedance calculation module 502, and is configured to quantify the energy loss of individual student behaviors in the hypergraph network. The dissipation analysis module 503 extracts the behavioral characteristics of the student under test within the current time window and constructs a behavior intensity vector. Combining the semantic impedance value and the behavior intensity vector, the dissipation analysis module 503 calculates the instantaneous energy dissipation value of each student. The dissipation analysis module 503 further statistically analyzes the dissipation distribution of all active nodes within the current time window, calculates the group average dissipation baseline, and generates a normalized relative dissipation index accordingly. The dissipation analysis module 503 outputs the relative dissipation index to the decision module 505.
[0032] Sensitivity detection module 504 is connected to hypergraph construction module 501 and configured to detect latent network structure anomalies. Sensitivity detection module 504 calculates the stationary distribution probability of all network nodes and the global structural entropy of the system based on the hypergraph association matrix. Sensitivity detection module 504 executes a counterfactual inference procedure to construct a virtual perturbed hypergraph replica in the computation space after removing the student node to be detected. Sensitivity detection module 504 calculates the global structural entropy under the perturbed state and compares the entropy changes before and after the perturbed state to generate a structural sensitivity index. Sensitivity detection module 504 outputs the structural sensitivity index to decision module 505.
[0033] The decision module 505 is connected to the dissipation analysis module 503 and the sensitivity detection module 504, and is configured to generate the final anomaly identification result. The decision module 505 receives the relative dissipation index and the structural sensitivity index, performs numerical normalization and weighted fusion operations, and obtains a comprehensive anomaly score. The decision module 505 has built-in classification and decision logic, which determines the anomaly type based on the comprehensive score and the comparison results of the sub-indicators with preset thresholds. The decision module 505 distinguishes between explicit behavioral violations and implicit structural anomalies, and outputs a detection report containing anomaly type labels and confidence scores to the early warning log database or management terminal.
[0034] Referring to Figure 3, this invention provides a method for identifying abnormal behavior in student communities based on big data. This method is executed by a computing processing device 100 and specifically includes the following steps: S100. Construct a dynamic spatiotemporal hypergraph sequence using the hypergraph construction module 501. The hypergraph construction module 501 receives preprocessed multi-source heterogeneous data and slices the data stream according to a preset time window length. Within each time slice, the hypergraph construction module 501 identifies the co-occurrence relationships among multiple students, locations, and event attributes, generating hyperedges that connect multiple nodes. Based on this, the hypergraph construction module 501 generates a series of static hypergraph snapshots arranged in chronological order and calculates the association matrix corresponding to each hypergraph snapshot, completing the mathematical representation of higher-order association relationships.
[0035] S200. Construct a behavioral semantic impedance network using the impedance calculation module 502. The impedance calculation module 502 maps the dynamic spatiotemporal hypergraph sequence into a high-order tensor data structure. The impedance calculation module 502 performs a non-negative Tucker decomposition operation on this high-order tensor to extract the core tensor and factor matrix representing the regular behavioral patterns of the group. The impedance calculation module 502 calculates the semantic matching degree between each hyperedge in the current hypergraph and the core tensor, and defines the semantic impedance value of each hyperedge according to the reciprocal relationship of the matching degree, thereby endowing the hypergraph network with physical impedance properties.
[0036] S300. Calculate the energy dissipation of the behavioral flow using the dissipation analysis module 503. The dissipation analysis module 503 constructs the behavioral intensity vector of the target student within the current time slice. The dissipation analysis module 503 simulates the propagation process of the behavioral flow in the hypergraph network, and calculates the instantaneous energy dissipation value generated by the behavioral flow by combining the semantic impedance values of each hyperedge. The dissipation analysis module 503 further calculates the average energy dissipation baseline of the current group under the same time slice, and uses this baseline to normalize the instantaneous energy dissipation value of the target student to obtain a relative dissipation index reflecting the degree of overt behavioral abnormality.
[0037] S400. Perform counterfactual hyperedge perturbation analysis using the sensitivity detection module 504. The sensitivity detection module 504 calculates the global structural entropy of the current hypergraph system based on the Laplacian matrix of the hypergraph. The sensitivity detection module 504 executes counterfactual inference logic, virtually removing the hyperedge connections associated with the target student during the calculation to construct the perturbed hypergraph state. The sensitivity detection module 504 then calculates the global structural entropy of the system after the perturbation and the change in structural entropy before and after the perturbation, obtaining a structural sensitivity index that reflects the degree of latent structural anomaly.
[0038] S500. Perform a dual heterogeneity fusion decision using the decision module 505. The decision module 505 receives the relative dissipation index and the structural sensitivity index, and calculates a comprehensive anomaly score using a weighted fusion algorithm. Based on the comprehensive anomaly score and the characteristics of the individual indices, the decision module 505 distinguishes the anomaly type. If the relative dissipation index exceeds a first threshold, the decision module 505 determines an explicit violation anomaly; if the structural sensitivity index exceeds a second threshold, the decision module 505 determines a latent structural anomaly. The decision module 505 ultimately outputs the identification result, including the anomaly type and its corresponding confidence level.
[0039] The hypergraph construction process provided by this invention is executed by the hypergraph construction module 501, which aims to transform discrete trajectory data into a mathematical structure with high-order topological properties.
[0040] The hypergraph construction module 501 first establishes a global node index table and defines the node set of the hypergraph. The node set $V$ consists of three mutually disjoint subsets, namely $V = V_S \cup V_L \cup V_A$, where $V_S$ represents the set of student entity nodes, in which each node uniquely corresponds to a currently enrolled student; $V_L$ represents the set of physical space nodes, including geographical location markers for classrooms, dormitories, canteens, library reading rooms, and laboratories; and $V_A$ represents the set of event attribute nodes, including course identifiers, club activity identifiers, and consumer terminal type identifiers. The hypergraph construction module 501 assigns a unique global index number to each node.
[0041] The hypergraph construction module 501 discretizes the continuous time axis. The time window length is set to $\Delta t$, and the observation period is divided into an ordered sequence of time slices $t = 1, 2, \dots, T$. In each time slice $t$, the hypergraph construction module 501 extracts all interaction records that occurred during that period and constructs the corresponding static hypergraph snapshot $G_t = (V, E_t)$, where $E_t$ denotes the set of hyperedges within the $t$-th time slice.
[0042] The hypergraph construction module 501 generates hyperedges based on the principle of multi-agent co-occurrence. Any hyperedge $e \in E_t$ represents a specific aggregated event. When multiple students simultaneously appear at a specific location or participate in a specific activity within time slice $t$, the hypergraph construction module 501 combines these student nodes, their corresponding location nodes, and attribute nodes to generate a hyperedge. Formally, the hyperedge is $e = \{v_{s_1}, \dots, v_{s_k}, v_l, v_a\}$, where $v_{s_1}, \dots, v_{s_k}$ are student entity nodes, $v_l$ is a physical space node, and $v_a$ is an event attribute node.
[0043] The hypergraph construction module 501 uses the association (incidence) matrix $H_t$ to store and represent the generated static hypergraph snapshot. The dimension of $H_t$ is $|V| \times |E_t|$, that is, the number of nodes by the number of hyperedges in the time slice. The elements $h_t(v, e)$ of the association matrix are assigned as follows: if node $v$ is contained in hyperedge $e$, then $h_t(v, e) = 1$; otherwise $h_t(v, e) = 0$.
[0044] The hypergraph construction module 501 further calculates the hypergraph degree of each node for subsequent weight analysis. The node degree $d(v)$ is calculated as $d(v) = \sum_{e} h_t(v, e)$, where $h_t(v, e)$ is the association matrix element for node $v$ and hyperedge $e$ and the sum runs over all hyperedges in the current time slice; this degree reflects the frequency of a node's event participation within the current time slice. The hyperedge degree $\delta(e)$ is calculated as $\delta(e) = \sum_{v} h_t(v, e)$, where the sum runs over all nodes; this degree reflects the scale of the entities involved in the event. The hypergraph construction module 501 stores the completed dynamic hypergraph sequence and its corresponding association matrix into the hypergraph structure database unit 402.
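The snapshot construction and degree calculations for a single time slice can be sketched in Python as follows. The record format `(student, location, attribute)`, the rule that a hyperedge needs at least two students, and all function names are assumptions for illustration; node names are assumed to be unique across the three node types.

```python
from collections import defaultdict


def build_snapshot(records):
    """Build one static hypergraph snapshot from (student, location, attribute) records."""
    groups = defaultdict(set)
    for student, location, attribute in records:
        groups[(location, attribute)].add(student)
    hyperedges = []
    for key in sorted(groups):
        students = groups[key]
        if len(students) >= 2:  # assumption: co-occurrence needs multiple students
            hyperedges.append(frozenset(students) | frozenset(key))
    nodes = sorted({n for e in hyperedges for n in e})
    # Association (incidence) matrix: rows are nodes, columns are hyperedges.
    incidence = [[1 if n in e else 0 for e in hyperedges] for n in nodes]
    return nodes, hyperedges, incidence


def node_degrees(incidence):
    """Row sums: number of hyperedges each node participates in."""
    return [sum(row) for row in incidence]


def edge_degrees(incidence):
    """Column sums: number of nodes contained in each hyperedge."""
    return [sum(col) for col in zip(*incidence)]


records = [
    ("s1", "lib", "study"), ("s2", "lib", "study"),
    ("s3", "gym", "club"),
    ("s1", "cafe", "meal"), ("s3", "cafe", "meal"),
]
nodes, hyperedges, incidence = build_snapshot(records)
```

In this example the lone gym visit by `s3` produces no hyperedge, while `s1` appears in two hyperedges and therefore has node degree 2.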
[0045] This invention provides a method for constructing a behavioral semantic impedance network, which is executed by the impedance calculation module 502 and may specifically include the following. The impedance calculation module 502 reads the time slice sequence and the corresponding association matrix data from the hypergraph structure database unit 402. The impedance calculation module 502 maps the discrete dynamic hypergraph sequence into a continuous third-order tensor structure $\mathcal{X}$. The dimension of tensor $\mathcal{X}$ is defined as $N \times N \times T$, where $N$ is the total number of nodes and $T$ is the total number of time slices. The element $x_{ijt}$ of the tensor represents the co-occurrence association strength between node $v_i$ and node $v_j$ within the $t$-th time slice. If nodes $v_i$ and $v_j$ both appear in the same hyperedge during this time period, the impedance calculation module 502 assigns $x_{ijt}$ based on the weight of that hyperedge; otherwise $x_{ijt}$ is assigned zero.
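One way to build such a co-occurrence tensor is sketched below using a sparse dictionary keyed by `(i, j, t)`. Summing the weights when two nodes share several hyperedges in one slice, and defaulting every hyperedge weight to 1.0, are assumptions; the source only states that the element is assigned based on the hyperedge weight.

```python
from itertools import combinations


def cooccurrence_tensor(snapshots, index, weights=None):
    """Sparse third-order tensor {(i, j, t): strength} from hypergraph snapshots.

    snapshots: list over time slices, each a list of hyperedges (sets of node names).
    index:     hypothetical global node index table, node name -> integer.
    weights:   optional per-slice, per-hyperedge weights (default 1.0 each).
    """
    tensor = {}
    for t, hyperedges in enumerate(snapshots):
        for k, edge in enumerate(hyperedges):
            w = 1.0 if weights is None else weights[t][k]
            for u, v in combinations(sorted(edge), 2):
                i, j = index[u], index[v]
                # Symmetric assignment; absent keys are implicitly zero.
                tensor[(i, j, t)] = tensor.get((i, j, t), 0.0) + w
                tensor[(j, i, t)] = tensor.get((j, i, t), 0.0) + w
    return tensor


index = {"a": 0, "b": 1, "c": 2}
tensor = cooccurrence_tensor([[{"a", "b", "c"}], [{"a", "b"}]], index)
```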
[0046] The impedance calculation module 502 performs non-negative Tucker decomposition on the constructed third-order tensor $\mathcal{X}$. Using an iterative update algorithm, the impedance calculation module 502 decomposes the original third-order tensor $\mathcal{X}$ into the mode product of a core tensor $\mathcal{G}$, a node factor matrix $A$, and a time factor matrix $B$. The operational relationship is expressed as $\mathcal{X} \approx \mathcal{G} \times_1 A \times_2 A \times_3 B$, where the core tensor $\mathcal{G}$ characterizes stable structural motifs of group behavior in the latent space; the node factor matrix $A$ represents the membership degree of each node in the latent behavioral patterns; and the time factor matrix $B$ represents the trend of behavioral patterns evolving over time.
[0047] The impedance calculation module 502 calculates a semantic consistency score for any given hyperedge based on the latent space features obtained from the decomposition. For a hyperedge observed within the current time slice, the module extracts the factor vectors of the nodes contained in the hyperedge and, combining them with the time factor vector of the current time slice, calculates their degree of matching with the core tensor to obtain the semantic consistency score. In this calculation, the factor vector of a node is the element of the node factor matrix corresponding to that node, and the time factor vector is the element of the time factor matrix corresponding to the current time slice.
[0048] The impedance calculation module 502 calculates the semantic impedance value of each hyperedge from its semantic consistency score. The module defines impedance as an inverse proportional function of the score, so that common, high-frequency co-occurrence patterns correspond to lower impedance values, while sparse or conflicting patterns correspond to higher impedance values. The impedance calculation formula is $Z(e) = 1 / (s(e) + \epsilon)$, where $s(e)$ is the semantic consistency score of hyperedge $e$ and $\epsilon$ is a non-zero smoothing coefficient. The impedance calculation module 502 traverses all hyperedges within the current time slice, calculates the semantic impedance value of each hyperedge, and writes the results into the hypergraph structure database unit 402 as attribute weights of the hyperedges, thus completing the construction of the behavioral semantic impedance network.
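A minimal sketch of the inverse proportional mapping follows. It assumes the smoothing coefficient is added to the score in the denominator, which is one common reading of an inverse function with a smoothing term. The function name, the default `epsilon`, and the sample scores are hypothetical.

```python
def semantic_impedance(consistency_score, epsilon=1e-6):
    """Map a non-negative consistency score to an impedance value.

    Assumption: impedance = 1 / (score + epsilon), with epsilon > 0
    preventing division by zero for patterns that were never observed.
    """
    if consistency_score < 0 or epsilon <= 0:
        raise ValueError("score must be non-negative and epsilon positive")
    return 1.0 / (consistency_score + epsilon)


# Hypothetical scores: a routine pattern and a rare pattern.
scores = {"scheduled_class": 0.9, "rare_gathering": 0.01}
impedance = {edge: semantic_impedance(s) for edge, s in scores.items()}
```

A high consistency score yields a low impedance and a low score yields a high impedance, as the paragraph above requires.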
[0049] This invention provides a method for analyzing the energy dissipation of behavioral flows, which is executed by the dissipation analysis module 503 and may specifically include the following. For each student node $v_i$ to be tested, the dissipation analysis module 503 constructs a behavior intensity vector $\mathbf{b}_i$ within the specified time slice $t$. The dimension of this vector equals the total number of hyperedges within the current time slice. The module traverses all event records in which the student node participated to determine the student's activity level on each hyperedge. If student node $v_i$ is contained in hyperedge $e_j$, the module assigns the corresponding element $b_{ij}$ based on the duration or frequency of the student's participation in the event; if the student does not participate in a hyperedge, the corresponding element is set to zero.
[0050] The dissipation analysis module 503 reads the semantic impedance value of each hyperedge from the hypergraph structure database unit 402. Following the principle of energy transfer in physical fields, the module calculates the instantaneous energy dissipation $E_i$ of student node $v_i$ in the current hypergraph network. This calculation combines the behavior intensity vector with the semantic impedance attribute: $E_i = \sum_j b_{ij}^2 \, Z(e_j)$, where $b_{ij}$ is the element of the behavior intensity vector for hyperedge $e_j$, $Z(e_j)$ is the semantic impedance value of that hyperedge, and the sum runs over all hyperedges in the current time slice. According to this formula, when a student's behavior is mainly concentrated on low-impedance, conventional hyperedges (such as attending classes according to the timetable), the calculated energy dissipation value is low; when a student participates intensively in high-impedance, abnormal hyperedges (such as high-frequency interactions in non-teaching areas during class time), the calculated energy dissipation value increases significantly.
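The dissipation calculation, as worded in claim 4 (sum of squared intensity elements multiplied by the impedance of the corresponding hyperedge), can be sketched as below. The function name and the sample values are hypothetical.

```python
def energy_dissipation(intensity, impedance):
    """Sum of squared behavior intensity times hyperedge impedance.

    intensity[j] and impedance[j] refer to the same hyperedge j.
    """
    if len(intensity) != len(impedance):
        raise ValueError("one intensity element is required per hyperedge")
    return sum(b * b * z for b, z in zip(intensity, impedance))


# Hypothetical slice with one low-impedance and one high-impedance hyperedge.
impedance = [1.0, 10.0]
regular_student = energy_dissipation([2.0, 0.0], impedance)
unusual_student = energy_dissipation([0.0, 2.0], impedance)
```

With equal activity levels, the student active on the high-impedance hyperedge accumulates ten times the dissipation in this toy example, which reflects the behavior described in the paragraph above.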
[0051] The dissipation analysis module 503 determines the group baseline indices for the current time slice $t$. It calculates the instantaneous energy dissipation values of all active student nodes within the current time slice and then calculates their group mean $\mu_t$ and standard deviation $\sigma_t$. This step quantifies the overall level and dispersion of behavioral activity in the current environmental context, thereby eliminating the influence of school-wide collective activities or sudden public events on any single individual's value.
[0052] The dissipation analysis module 503 uses the group baseline indices to standardize the instantaneous energy dissipation value of each individual, generating a relative dissipation index $R_i = (E_i - \mu_t) / \sigma_t$, where $E_i$ is the instantaneous energy dissipation value of student node $v_i$, and $\mu_t$ and $\sigma_t$ are the group mean and standard deviation for the current time slice. The module takes the relative dissipation index as the feature value characterizing the degree of explicit behavioral abnormality and stores it in the current state record of the student node for subsequent use by the decision module 505. The relative dissipation index reflects how far an individual's behavior deviates from the current group norm, without being affected by overall environmental fluctuations.
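The standardization step can be sketched with the standard library. Two points are assumptions: the population standard deviation (`statistics.pstdev`) is used because the baseline is computed over all active nodes, and a zero standard deviation is mapped to an index of zero for every node, a case the text does not address. The function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def relative_dissipation(energies):
    """Standardize each node's dissipation against the group baseline.

    energies maps a node identifier to its instantaneous dissipation value.
    """
    values = list(energies.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # Assumption: identical values mean nobody deviates from the group.
        return {node: 0.0 for node in energies}
    return {node: (e - mu) / sigma for node, e in energies.items()}


# Hypothetical time slice with three typical students and one outlier.
energies = {"s1": 4.0, "s2": 4.0, "s3": 4.0, "s4": 40.0}
index = relative_dissipation(energies)
```

Because the index is relative to the group in the same time slice, a school-wide event that raises every student's dissipation shifts the mean and leaves the index largely unchanged.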
[0053] This invention provides a method for detecting latent structural anomalies, which is executed by the sensitivity detection module 504 and may specifically include the following. The sensitivity detection module 504 first calculates the stationary distribution probability of each node in the hypergraph network based on the hypergraph structure of the current time slice $t$. According to hypergraph random walk theory, the stationary distribution probability $\pi(v)$ of a node is positively correlated with its weighted degree $d(v)$. Specifically, the module calculates the sum of the degrees of all nodes in the current hypergraph and divides the degree of a single node by that sum to obtain the normalized stationary probability: $\pi(v) = d(v) / \sum_{u} d(u)$. The module then calculates the global structural entropy $H_S$ of the current hypergraph system using the definition of information entropy; this metric quantifies the complexity and disorder of the current community network structure. Iterating over all nodes in the network, the module takes the negative of the log-weighted sum of the stationary distribution probabilities: $H_S = -\sum_{v} \pi(v) \log \pi(v)$. Next, the sensitivity detection module 504 performs a counterfactual perturbation operation on the student node $v_i$ to be detected. It constructs a virtual perturbed hypergraph replica in memory and, in that replica, performs a structural pruning operation that removes student node $v_i$ from all hyperedge associations in which it participates, thereby simulating a counterfactual scenario in which the student node is invalid or absent from the current social network structure.
[0054] Based on the perturbed hypergraph replica, the sensitivity detection module 504 recalculates the stationary distribution probability of the remaining nodes in the network and, from it, the perturbed global structural entropy $H_S'$. The recalculation follows the stationary distribution formula and the structural entropy formula described above, taking the node degree distribution and network topology after perturbation as input.
[0055] The sensitivity detection module 504 calculates the structural sensitivity index $\Delta_i$ of student node $v_i$. This index is defined as the absolute value of the change in global structural entropy before and after the counterfactual perturbation: $\Delta_i = |H_S - H_S'|$, where $H_S$ is the global structural entropy of the original hypergraph and $H_S'$ is the global structural entropy of the perturbed replica. The module outputs the structural sensitivity index to the decision module 505. The index reflects how strongly the stability of the local network structure depends on the student node. In a dense community structure, the removal of a single node is buffered by the redundant connections of its neighbors, resulting in a small change in entropy; in a socially isolated or fragile connection state, the removal of a single node causes a significant change in the local topology, resulting in a large fluctuation in global entropy.
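The stationary distribution, structural entropy, and counterfactual removal can be combined in one small sketch. The assumptions are: hyperedges are unweighted sets of node identifiers, the natural logarithm is used, removal deletes the node's membership from every hyperedge while keeping the other members, and hyperedges left empty are dropped. All function names and the sample hypergraph are hypothetical.

```python
import math


def node_degrees(hyperedges):
    """Count, for each node, the hyperedges that contain it."""
    degrees = {}
    for members in hyperedges:
        for node in set(members):
            degrees[node] = degrees.get(node, 0) + 1
    return degrees


def structural_entropy(hyperedges):
    """Shannon entropy of the degree-normalized stationary distribution."""
    degrees = node_degrees(hyperedges)
    total = sum(degrees.values())
    if total == 0:
        return 0.0
    return -sum((d / total) * math.log(d / total) for d in degrees.values())


def remove_node(hyperedges, node):
    """Return a perturbed copy with the node pruned from every hyperedge."""
    pruned = [set(members) - {node} for members in hyperedges]
    return [members for members in pruned if members]  # drop empty hyperedges


def structural_sensitivity(hyperedges, node):
    """Absolute entropy change caused by virtually removing the node."""
    before = structural_entropy(hyperedges)
    after = structural_entropy(remove_node(hyperedges, node))
    return abs(before - after)


# Hypothetical time slice with four students and three hyperedges.
edges = [{"a", "b", "c"}, {"a", "b"}, {"c", "d"}]
sensitivity = {node: structural_sensitivity(edges, node) for node in "abcd"}
```

The original list of hyperedges is never modified, so the same hypergraph can be reused to test every student node in turn.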
[0056] This invention provides a method for determining dual heterogeneity anomalies, which is executed by the decision module 505 and may specifically include the following. The decision module 505 receives the relative dissipation index $R_i$ from the dissipation analysis module 503 and the structural sensitivity index $\Delta_i$ from the sensitivity detection module 504. It first normalizes the two sets of input data, mapping indices with different physical dimensions to the same numerical interval [0,1]. Using the max-min normalization algorithm, the decision module 505 calculates the normalized relative dissipation index $\tilde{R}_i$ and the normalized structural sensitivity index $\tilde{\Delta}_i$ within the current time slice.
[0057] The decision module 505 constructs a comprehensive anomaly scoring function that performs a weighted fusion of the normalized indices. To filter out low-amplitude structural noise, the module introduces an indicator function: the normalized structural sensitivity index is included in the comprehensive score only when it exceeds a preset base noise threshold $\theta_0$. The comprehensive anomaly score is calculated as $A_i = \alpha \tilde{R}_i + \beta \tilde{\Delta}_i \cdot \mathbb{I}(\tilde{\Delta}_i > \theta_0)$, where $\tilde{R}_i$ and $\tilde{\Delta}_i$ are the normalized relative dissipation index and the normalized structural sensitivity index, $\alpha$ and $\beta$ are the weighting coefficients for explicit behavioral features and implicit structural features, respectively, and $\mathbb{I}(\cdot)$ is a logical indicator function that takes the value 1 when the condition is met and 0 otherwise.
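The normalization and gated fusion can be sketched as follows. The weights `alpha` and `beta`, the noise threshold, and the sample values are hypothetical placeholders (the text gives no numeric values), and mapping a constant input to zero during normalization is an added assumption.

```python
def min_max(values):
    """Max-min normalization of a {node: value} mapping to [0, 1]."""
    low, high = min(values.values()), max(values.values())
    if high == low:
        return {node: 0.0 for node in values}  # assumption for constant input
    return {node: (v - low) / (high - low) for node, v in values.items()}


def comprehensive_scores(dissipation, sensitivity,
                         alpha=0.6, beta=0.4, noise_threshold=0.2):
    """Weighted fusion; sensitivity counts only above the noise threshold."""
    r = min_max(dissipation)
    d = min_max(sensitivity)
    scores = {}
    for node in r:
        gate = 1.0 if d[node] > noise_threshold else 0.0  # indicator function
        scores[node] = alpha * r[node] + beta * d[node] * gate
    return scores


# Hypothetical indices for three students in one time slice.
dissipation = {"s1": -0.5, "s2": 0.0, "s3": 2.0}
sensitivity = {"s1": 0.01, "s2": 0.02, "s3": 0.11}
scores = comprehensive_scores(dissipation, sensitivity)
```

Here the structural term of `s2` falls below the noise threshold after normalization, so only its dissipation term contributes to the score.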
[0058] The decision module 505 executes classification and judgment logic based on the comprehensive anomaly score and the characteristics of each sub-indicator. The module presets an explicit anomaly determination threshold $\tau_E$ and a latent anomaly detection threshold $\tau_L$. It first determines whether the relative dissipation index is greater than the explicit anomaly determination threshold $\tau_E$. If so, the decision module 505 generates an explicit violation anomaly marker. This marker corresponds to high-intensity activity by the student on a high-impedance semantic path and indicates explicit behavior that violates group behavioral norms, such as skipping class, failing to return to the dormitory, or using high-powered electrical appliances.
[0059] The decision module 505 further determines whether the structural sensitivity index is greater than the latent anomaly detection threshold $\tau_L$ while the relative dissipation index remains within the normal range. If so, the decision module 505 generates a latent structural anomaly marker. This marker corresponds to the high sensitivity shown by the student node in the counterfactual perturbation test. It indicates that, although the student has shown no violations in the explicit behavioral data, the student is in an isolated, disconnected, or fragile position in the social topology and faces a potential risk of psychological isolation or marginalization.
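The two-stage judgment can be sketched as a small function. It assumes that "within the normal range" means the relative dissipation index does not exceed the explicit threshold; the label strings and threshold values are hypothetical.

```python
def classify(relative_dissipation, structural_sensitivity,
             explicit_threshold, latent_threshold):
    """Return an anomaly type label for one student node.

    Assumption: a dissipation index at or below explicit_threshold
    counts as being within the normal range.
    """
    if relative_dissipation > explicit_threshold:
        return "explicit_violation"
    if structural_sensitivity > latent_threshold:
        return "latent_structural"
    return "normal"


# Hypothetical thresholds and three hypothetical students.
labels = [
    classify(2.5, 0.1, explicit_threshold=2.0, latent_threshold=0.5),
    classify(0.3, 0.8, explicit_threshold=2.0, latent_threshold=0.5),
    classify(0.3, 0.1, explicit_threshold=2.0, latent_threshold=0.5),
]
```

Checking the explicit condition first means that a student who exceeds both thresholds is reported as an explicit violation, which is consistent with the order of the determinations in the text.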
[0060] The decision module 505 packages the anomaly type marker generated by the judgment, the corresponding anomaly confidence value, and the identifier of the time window that triggered the anomaly into an anomaly event record. The decision module 505 writes this anomaly event record into the system's early warning log database and sends a tiered response command to the management terminal through the system interface. For explicit violation anomalies, the system triggers a behavioral intervention process; for latent structural anomalies, the system triggers a counselor attention or mental health screening process.
[0061] This invention provides a system for identifying abnormal behavior in student communities. The system runs on a server or computing cluster and may include: a hypergraph construction module 501, an impedance calculation module 502, a dissipation analysis module 503, a sensitivity detection module 504, and a decision module 505.
[0062] The hypergraph construction module 501 is configured to receive time-series data streams from the data preprocessing layer. The hypergraph construction module 501 executes hyperedge generation logic, mapping student entities, location entities, and attribute entities within the same time window to node sets according to preset time window parameters. The hypergraph construction module 501 identifies co-occurrence relationships between multiple entities, generates a hyperedge structure containing multiple nodes, and constructs the corresponding hypergraph association matrix. The hypergraph construction module 501 outputs the generated dynamic hypergraph sequence data to the impedance calculation module 502 and the dissipation analysis module 503.
[0063] The impedance calculation module 502 is connected to the hypergraph construction module 501 and is configured to calculate the impedance properties of each connection path in the hypergraph network. The impedance calculation module 502 includes a tensor operation unit that converts the input dynamic hypergraph sequence into a higher-order tensor structure and performs non-negative Tucker decomposition to extract core behavioral motifs. The impedance calculation module 502 calculates the semantic matching degree between each currently observed hyperedge and the core behavioral motifs, and generates the semantic impedance value of each hyperedge based on the reciprocal of the matching degree. The impedance calculation module 502 transmits the calculated semantic impedance data to the dissipation analysis module 503.
[0064] The dissipation analysis module 503 is connected to the hypergraph construction module 501 and the impedance calculation module 502, and is configured to quantify the energy loss of individual student behaviors in the hypergraph network. The dissipation analysis module 503 extracts the behavioral characteristics of the student under test within the current time window and constructs a behavior intensity vector. Combining the semantic impedance value and the behavior intensity vector, the dissipation analysis module 503 calculates the instantaneous energy dissipation value of each student. The dissipation analysis module 503 further statistically analyzes the dissipation distribution of all active nodes within the current time window, calculates the group average dissipation baseline, and generates a normalized relative dissipation index accordingly. The dissipation analysis module 503 outputs the relative dissipation index to the decision module 505.
[0065] The sensitivity detection module 504 is connected to the hypergraph construction module 501 and is configured to detect latent network structure anomalies. The sensitivity detection module 504 calculates the stationary distribution probability of all network nodes and the global structural entropy of the system based on the hypergraph association matrix. The sensitivity detection module 504 executes a counterfactual inference procedure, constructing in the computation space a virtual perturbed hypergraph replica from which the student node to be detected has been removed. The sensitivity detection module 504 calculates the global structural entropy in the perturbed state and compares the entropy before and after the perturbation to generate a structural sensitivity index. The sensitivity detection module 504 outputs the structural sensitivity index to the decision module 505.
[0066] The decision module 505 is connected to the dissipation analysis module 503 and the sensitivity detection module 504, and is configured to generate the final anomaly identification result. The decision module 505 receives the relative dissipation index and the structural sensitivity index, performs numerical normalization and weighted fusion operations, and obtains a comprehensive anomaly score. The decision module 505 has built-in classification and decision logic, which determines the anomaly type based on the comparison results of the comprehensive score and sub-indicators with preset thresholds. The decision module 505 distinguishes between explicit behavioral violations and implicit structural anomalies, and outputs a detection report containing anomaly type labels and confidence scores.
Claims
1. A method for identifying abnormal behavior in student communities based on big data, characterized in that it includes the following steps: slicing multi-source heterogeneous data according to a preset time window, and constructing a dynamic spatiotemporal hypergraph sequence containing nodes and hyperedges based on entity co-occurrence relationships; mapping the dynamic spatiotemporal hypergraph sequence to a higher-order tensor, performing non-negative Tucker decomposition to extract a core tensor, and determining a semantic impedance value based on the semantic matching degree between each hyperedge and the core tensor; constructing a behavior intensity vector of the student node to be detected, calculating an instantaneous energy dissipation value in combination with the semantic impedance value, and standardizing the instantaneous energy dissipation value using group baseline indices to obtain a relative dissipation index; calculating a global structural entropy based on the node degree distribution, constructing a perturbed hypergraph replica by virtually removing the student node to be detected, and calculating the change in the global structural entropy before and after the removal to generate a structural sensitivity index; normalizing and weighting the relative dissipation index and the structural sensitivity index to obtain a comprehensive anomaly score, and determining an explicit violation anomaly or a latent structural anomaly based on the comprehensive anomaly score and the sub-indicators.
2. The method for identifying abnormal behavior in student communities based on big data according to claim 1, characterized in that the construction of the dynamic spatiotemporal hypergraph sequence containing nodes and hyperedges specifically includes: establishing a global node index table, and defining the node set as consisting of a student entity node set, a physical space node set, and an event attribute node set; for each time slice, extracting all interaction records within the time slice and, when multiple students appear at the same location or participate in the same activity, combining the associated student nodes, location nodes, and attribute nodes to generate a hyperedge; constructing an association matrix whose element values represent the inclusion relationship between hyperedges and nodes, where the corresponding position in the association matrix is assigned a value of 1 if a node is included in the hyperedge and a value of 0 otherwise; obtaining the dynamic spatiotemporal hypergraph sequence based on the node set, the generated hyperedges, and the association matrix of each time slice.
3. The method for identifying abnormal behavior in student communities based on big data according to claim 1, characterized in that the determination of the semantic impedance value based on the semantic matching degree between the hyperedge and the core tensor specifically includes: mapping the dynamic spatiotemporal hypergraph sequence to a third-order tensor whose elements represent the co-occurrence association strength between nodes; decomposing the third-order tensor into a mode product of the core tensor, a node factor matrix, and a time factor matrix; extracting the factor vectors of the nodes contained in the currently observed hyperedge and the time factor vector at the current moment, and calculating the degree of matching between these factor vectors and the core tensor to obtain a semantic consistency score; applying an inverse proportional function to the semantic consistency score to obtain the semantic impedance value.
4. The method for identifying abnormal behavior in student communities based on big data according to claim 1, characterized in that the calculation of the instantaneous energy dissipation value in combination with the semantic impedance value specifically includes: traversing all event records of the student node to be detected, and assigning values to the behavior intensity vector based on the duration of participation or the frequency of interaction; calculating the product of the square of each element in the behavior intensity vector and the semantic impedance value of the hyperedge corresponding to that element, and summing all the products to obtain the instantaneous energy dissipation value.
5. The method for identifying abnormal behavior in student communities based on big data according to claim 4, characterized in that the standardization of the instantaneous energy dissipation value using the group baseline indices specifically includes: collecting the instantaneous energy dissipation values of all active student nodes within the current time slice, and calculating their group mean and standard deviation as the group baseline indices; subtracting the group mean from the instantaneous energy dissipation value of the student node to be detected and dividing the result by the standard deviation to obtain the relative dissipation index.
6. The method for identifying abnormal behavior in student communities based on big data according to claim 1, characterized in that the calculation of the global structural entropy based on the node degree distribution specifically includes: calculating the sum of the degrees of all nodes in the current hypergraph network, and dividing the degree of a single node by the sum of the degrees to obtain the normalized stationary distribution probability of the node; taking the negative of the log-weighted sum of the stationary distribution probabilities to obtain the global structural entropy of the current hypergraph system.
7. The method for identifying abnormal behavior in student communities based on big data according to claim 6, characterized in that the calculation of the change in the global structural entropy before and after the removal to generate the structural sensitivity index specifically includes: constructing a virtual perturbed hypergraph replica, and deleting, in the perturbed hypergraph replica, the student node to be detected and all hyperedge connections in which it participates; recalculating the stationary distribution probability of the remaining nodes and the perturbed global structural entropy based on the perturbed hypergraph replica; calculating the absolute value of the difference between the global structural entropy before and after the perturbation as the structural sensitivity index.
8. The method for identifying abnormal behavior in student communities based on big data according to claim 1, characterized in that the normalizing and weighting of the relative dissipation index and the structural sensitivity index to obtain the comprehensive anomaly score specifically includes: applying max-min normalization to the relative dissipation index and the structural sensitivity index, respectively; setting a base noise threshold, and including the structural sensitivity index in the weighted calculation only when the normalized structural sensitivity index exceeds the base noise threshold; weighting and summing the normalized indices using the weighting coefficients of the explicit behavioral features and the implicit structural features to obtain the comprehensive anomaly score.
9. The method for identifying abnormal behavior in student communities based on big data according to claim 8, characterized in that the determination of the explicit violation anomaly specifically includes: determining whether the relative dissipation index is greater than a preset explicit anomaly determination threshold; if the judgment result is true, generating an explicit violation anomaly marker and a corresponding behavioral intervention response instruction.
10. The method for identifying abnormal behavior in student communities based on big data according to claim 8, characterized in that the determination of the latent structural anomaly specifically includes: determining whether the structural sensitivity index is greater than a preset latent anomaly detection threshold and whether the relative dissipation index is within the normal range; if the judgment result is true, generating a latent structural anomaly marker and a corresponding counselor attention or investigation instruction.