A method of game level design and creation
Through the level editing interface and data analysis model, cross-platform game level design and automatic creation have been achieved, solving the problems of long time consumption and poor adaptability of traditional design methods, and improving design efficiency and predictive ability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING WEIEN INFINITY INFORMATION TECHNOLOGY CO LTD
- Filing Date
- 2025-04-17
- Publication Date
- 2026-06-23
AI Technical Summary
Traditional game level design and testing methods are time-consuming and inefficient. Level difficulty depends on the experience of the designers and testers, so it does not adapt to players' skill levels. In addition, level editors usually work for only one game and cope poorly with changing requirements.
It provides an interactive interface for level editing, allowing designers to create, modify, and synchronize levels via a computer backend. The terminal renders and optimizes the layout in real time, obtains settlement information to analyze level difficulty, and automatically generates new levels, including maps, obstacle layouts, and item distributions of specified difficulty. Combined with player data analysis models, it supports cross-platform editing and automatic creation.
It enables timely saving and synchronization of level data, allowing designers to automatically generate new levels by specifying difficulty, thus improving design efficiency. It can also predict purchasing behavior based on player data, providing significant predictive capabilities and optimizing resource allocation and player retention rates.
Figure CN120132364B_ABST
Description
Technical Field
[0001] This invention pertains to mobile game level editors, specifically to casual, puzzle, and elimination-type level-based games.
Background Technology
[0002] Currently, mobile game developers typically have R&D staff develop PC-based design and editing tools, while game designers manually design and edit level content. The edited level data is then packaged into the mobile game, where designers and testers conduct testing. After testing, adjustments are made to the levels based on difficulty and resource consumption. When new items or obstacles are introduced, the development team needs to modify the tools accordingly.
[0003] Traditional game level design and testing methods are time-consuming and inefficient. Furthermore, level difficulty depends heavily on the experience of designers and testers, failing to adequately adapt to player skill levels. Thirdly, they are highly dependent on the cooperation of development staff when requirements change. Finally, most level editors are only compatible with a single game. Therefore, establishing a method for editing, testing, and automatically creating levels, and / or predicting player purchasing behavior, is of practical significance.
Summary of the Invention
[0004] This invention aims to at least partially solve one of the technical problems existing in the prior art. To this end, this invention provides a method for designing and creating game levels, characterized by the following steps: providing a level editing interface, enabling planners to create, modify, save, and synchronize levels to a terminal for testing and preview via a computer backend; rendering the edited level in real time on the terminal, allowing planners to experience the level design effect from the perspective of a game player and optimize and adjust the level layout; executing the action of completing the level from the player's perspective on the terminal; obtaining the settlement information after completing the level and synchronously transmitting the settlement information between the terminal and the backend; storing the settlement information in a backend data pool for analyzing level difficulty, obstacle effects, and item consumption, and correcting and optimizing the level data model; automatically generating a new level based on the settlement information, the new level including a level template with a map, obstacle layout, and item distribution matching a specified difficulty.
[0005] Preferably, automatically generating new levels based on the settlement information includes: calculating the difficulty curve of each level based on the settlement information, wherein the difficulty curve parameters include at least the level difficulty, user churn rate, and potential item consumption by users; and automatically generating new levels based on the difficulty curve.
[0006] Preferably, the calculation of level difficulty includes: calculating the completion coefficient for each level based on the user's level settlement information; the calculation method for the completion coefficient includes: calculating the average number of attempts needed to complete each level based on the user's level settlement information; calculating the average number of items used by users who completed the level based on the user's level settlement information; and determining, from the users' actual data, the number of attempts required to complete the level and the item consumption that affects its difficulty.
[0007] Preferably, the calculation of the user's potential item consumption includes: extracting multiple behavioral features from the settlement information, the behavioral features including the user's game activity status, item purchase behavior, game duration, number of sessions, number of completed levels, and acquisition and spending of in-game currency; discretizing the behavioral features, using a clustering algorithm to group the values of each behavioral feature into predetermined discrete intervals, so as to generate the representative ranges of the feature; detecting and removing outliers in the behavioral features to reduce the impact of extreme values on the discretization process; mapping the discretized behavioral features to a finite set of morphemes, where each morpheme represents a behavioral state, including: (i) an empty morpheme representing a player's inactivity; (ii) morphemes representing feature values within different intervals of the normal range; and (iii) special morphemes marking extreme behavioral states; arranging the morphemes generated from the discretized behavioral features in chronological order to form behavioral sequence morphemes, thereby constructing a sequence representation of the player's behavioral history; calculating the similarity between each morpheme and the other morphemes in the sequence based on a multi-head attention mechanism to capture short-term and long-term dependencies in the player's historical behavioral patterns; processing the morphemes in the sequence in parallel using a self-attention mechanism to generate a context vector containing overall information about the player's history; and using the context vector to classify and predict the user's item consumption, and outputting the calculation results.
[0008] Preferably, the K-means clustering algorithm is used to group the values of each behavioral feature into predetermined discrete intervals, thereby generating the representative ranges for that feature.
[0009] Compared to existing technologies, this invention offers the following advantages: it can interface with game operation and management systems to collect user experience data during actual gameplay; it establishes a level difficulty analysis model, objectively scoring and statistically analyzing level difficulty based on collected data; the PC version supports level editing, element creation, and modification; the data processing and storage system ensures that edited level data can be saved promptly and synchronized between the PC and mobile versions; the mobile version allows for real-time rendering and experience of edited levels, enabling adjustments to obstacles and items, as well as saving and uploading of modifications; game designers can specify level difficulty, and new levels are automatically generated from level models for further experience and modification. Furthermore, the method provided by this invention allows for a more comprehensive observation of player history, providing significant predictive capabilities.
Attached Figure Description
[0010] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this application, illustrate exemplary embodiments of the invention and, together with their description, serve to explain the invention and do not constitute an undue limitation thereof. In the drawings:
[0011] Figure 1 shows a method for designing and creating game levels according to an embodiment of the present invention.
Detailed Implementation
[0012] According to an embodiment of the present invention, a method for designing and creating game levels is provided, comprising the following steps: providing a level editing interactive interface, enabling planners to create, modify, save, and synchronize levels to a terminal for testing and preview via a computer backend; rendering the edited level in real time on the terminal, allowing planners to experience the level design effect from the perspective of a game player, and to optimize and adjust the level layout; performing the action of completing the level from the player's perspective on the terminal; obtaining the settlement information after completing the level, and synchronously transmitting the settlement information between the terminal and the backend, wherein the settlement information is stored in a backend data pool for analyzing level difficulty, obstacle effects, and item consumption, and correcting and optimizing the level data model; automatically generating a new level based on the settlement information, wherein the new level includes a level template with a map, obstacle layout, and item distribution matching a specified difficulty.
[0013] The game level design and creation editor of this invention includes a level editing interface, a level data processing module, a real-time rendering module, a data storage module, and an automatic level generation module based on player data analysis. Through the collaboration of these modules, cross-platform editing, testing, and automatic level creation are achieved.
[0014] The level editing interface is a graphical interface for PC, designed for game designers to easily create complex game scenes. This interface supports creating new levels and manually editing them using drag-and-drop, scaling, and rotation; it also supports reusing existing levels for modification; saving level drafts; and clearing edited content. Additionally, the real-time rendering and testing module renders edited levels in real-time on mobile devices, allowing designers to experience the design effects from a game perspective and easily adjust and optimize level layouts.
[0015] The level data processing module supports offline real-time local saving of game level information data and supports online uploading to cloud server backup; it can synchronize level data between PC and mobile devices, and save modification and commit records; it supports permission-based and version control branches to avoid conflicts and confusion.
[0016] The level generation module based on user data analysis includes the following steps:
[0017] Step 1: Obtain the settlement information for each level for the user.
[0018] User level settlement information includes: level ID, whether the level was passed, number of attempts, score, number of items used, etc.
[0019] The specific process is as follows: each time a user finishes a level, a level settlement will be performed, and the completion status of the current level will be uploaded to the cloud server. The server will then match and store the level settlement information with the user ID.
[0020] For example, the data structure is: (user ID, level ID, whether the level was completed, number of attempts, score, item ID_number of items used). The user's performance is recorded at the end of each level. For example, (10001,25,1,3,3650,item001_0,item002_3,item003_0,……) indicates that user ID 10001 successfully completed level 25 on the third attempt, scoring 3650 points and using item002 three times. Grouping records by user ID shows a user's performance on each level, while grouping by level ID shows the difficulty that different users experienced on that level.
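As an illustration of the record layout described above, the following minimal Python sketch parses one settlement tuple. The class name `Settlement`, its field names, and the function `parse_settlement` are hypothetical and are not part of the patent; the elided trailing item fields of the example are simply omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Settlement:
    """One level-settlement record, following the tuple layout in the text."""
    user_id: int
    level_id: int
    passed: bool
    attempts: int
    score: int
    items_used: dict = field(default_factory=dict)  # item ID -> count used


def parse_settlement(record: str) -> Settlement:
    """Parse '(user,level,passed,attempts,score,itemID_count,...)'."""
    parts = [p.strip() for p in record.strip().strip("()").split(",") if p.strip()]
    user_id, level_id, passed, attempts, score = (int(p) for p in parts[:5])
    items = {}
    for token in parts[5:]:
        # Each item field is '<item id>_<count>'; split on the last underscore.
        item_id, _, count = token.rpartition("_")
        items[item_id] = int(count)
    return Settlement(user_id, level_id, bool(passed), attempts, score, items)


example = parse_settlement("(10001,25,1,3,3650,item001_0,item002_3,item003_0)")
print(example)
```

Keeping the item counts in a dictionary keyed by item ID is only one possible choice; it makes the later per-level item statistics straightforward to compute.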
[0021] Step 2: Calculate the completion coefficient for each level based on the user's level settlement information.
[0022] For casual puzzle games, level difficulty is crucial, impacting the user's gaming pace and experience, and ultimately determining user retention and spending. Therefore, level design begins with establishing a completion coefficient for each level, serving as the basis for assessing level difficulty. Simply put, the completion coefficient is calculated based on the number of times a user attempts the same level until success, as well as the number of items used and consumed to successfully complete the level.
[0023] The specific method for calculating the completion coefficient is as follows:
[0024] 2-1: Calculate the average number of attempts for each level based on the user's level settlement information.
[0025] In the user game information data backed up to the cloud server, using the level ID as the primary key, select the settlement information of the same level ID most recently reported by each player, and calculate the number of attempts for that level corresponding to successful / failed completion. Calculate the average number of attempts 'a' for passing that level and the average number of attempts 'b' for failing to pass it.
[0026] For example: Users who pass level 25 play level 25 an average of a=4 times. Users who do not pass level 25 play level 25 an average of b=3 times.
[0027] For each level, the average number of games a user needs to play to complete it can be calculated. The number of games a user needs to play to complete a level reflects, to some extent, the level's difficulty.
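Step 2-1 can be sketched as follows. This is a minimal illustration that assumes records arrive as `(user_id, level_id, passed, attempts)` tuples in reporting order; the function name `average_attempts` and the sample data are invented for the example.

```python
from statistics import mean


def average_attempts(records, level_id):
    """Return (a, b): mean attempts of users who passed / did not pass.

    `records` is an iterable of (user_id, level_id, passed, attempts) tuples
    in reporting order; only each user's latest record for the level is kept.
    """
    latest = {}
    for user_id, lvl, passed, attempts in records:
        if lvl == level_id:
            latest[user_id] = (passed, attempts)  # later reports overwrite earlier ones
    passed_attempts = [n for ok, n in latest.values() if ok]
    failed_attempts = [n for ok, n in latest.values() if not ok]
    a = mean(passed_attempts) if passed_attempts else None
    b = mean(failed_attempts) if failed_attempts else None
    return a, b


# Made-up sample data for level 25.
sample = [
    (1, 25, False, 2), (1, 25, True, 3),  # user 1 passes on attempt 3
    (2, 25, True, 5),
    (3, 25, False, 3),
    (4, 25, False, 3),
    (5, 24, True, 1),                     # different level, ignored
]
a, b = average_attempts(sample, 25)
print(a, b)
```

With this sample, the passing users average a=4 attempts and the non-passing users average b=3, matching the values used in the example for level 25.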
[0028] 2-2: Calculate the average number of items used by users who cleared the level based on the user's level settlement information.
[0029] In the user game information data backed up to the cloud server, with the level ID as the primary key, count the percentage m of users who used items among those who passed the level, as well as the average number of items used, n.
[0030] For example, among users who have completed level 25, m=37% chose to use items, and the average number of items used was n=2.
[0031] The use of items by users to pass levels will affect the difficulty of the game. At the same time, each user has a different approach and strategy for using items, so the number and proportion of items used are also a key factor.
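The two statistics of step 2-2 could be computed as in the sketch below. The text does not say whether n is averaged over all passing users or only over those who used items; this sketch assumes the latter, and the function name `item_usage` and the sample counts are hypothetical.

```python
from statistics import mean


def item_usage(passed_item_counts):
    """Return (m, n) for one level.

    `passed_item_counts` holds the total number of items each passing user used.
    m is the share of passing users who used at least one item; n is the mean
    item count among those item users (an assumed reading of the text).
    """
    if not passed_item_counts:
        return 0.0, 0.0
    users_with_items = [c for c in passed_item_counts if c > 0]
    m = len(users_with_items) / len(passed_item_counts)
    n = mean(users_with_items) if users_with_items else 0.0
    return m, n


m, n = item_usage([0, 0, 3, 1, 0, 2, 0, 0])  # made-up counts for eight passing users
print(f"m={m:.1%}, n={n}")
```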
[0032] 2-3: Establish a level difficulty curve based on the level difficulty.
[0033] In this embodiment, the number of attempts required to complete each level is calculated from actual player data, together with the proportion of players who used items and the number of items they consumed, both of which affect difficulty. A difficulty curve is established across all levels, so designers and operators can easily see how difficulty is distributed. As more game levels are added and user feedback accumulates, the accuracy of the curve gradually improves, and the level evaluation model becomes increasingly accurate.
[0034] 2-4: Establish a correlation between difficulty and churn rate based on user churn data at each level.
[0035] According to game statistics and operational data, you can check how many users have dropped out of the game at each level.
[0036] Calculate the churn rate r for each level, and then establish a curve showing the relationship between the user churn rate r and the level difficulty.
[0037] If game levels are too difficult, users may churn after repeated attempts. Conversely, if most game levels are too easy and lack challenge, this can also lead to user churn. Understanding the correlation between user churn and level difficulty is crucial for adjusting the difficulty and pacing of future game levels.
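The text does not give a formula for the churn rate r. One possible definition, used here purely as an assumption, is the number of users who stopped playing at a level divided by the number of users who reached that level. The function name `churn_rate_by_level` and the input structures are hypothetical.

```python
from collections import Counter


def churn_rate_by_level(last_level_by_user, churned_users):
    """Return {level: r}, with r = users who quit at level / users who reached it.

    Assumes levels are played in order, so a user whose last level is L
    reached every level from 1 to L.
    """
    quit_at = Counter(last_level_by_user[u] for u in churned_users)
    rates = {}
    for level in range(1, max(last_level_by_user.values()) + 1):
        reached = sum(1 for last in last_level_by_user.values() if last >= level)
        rates[level] = quit_at[level] / reached if reached else 0.0
    return rates


last_level = {"u1": 1, "u2": 3, "u3": 3, "u4": 2}  # made-up data
churned = {"u1", "u3"}                             # u2 and u4 are still active
rates = churn_rate_by_level(last_level, churned)
print(rates)
```

Plotting each level's r against its difficulty measure would then give the difficulty-churn curve that the step describes.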
[0038] Step 3: Generate new levels based on level analysis data.
[0039] Through analysis of extensive user experience data on game levels, the correlation model between level difficulty and gameplay content, obstacles, and items will gradually become more refined and accurate. The editor will support automatically generating new level templates based on a specified difficulty level. The generated levels will include game element configurations matching the specified difficulty, such as level maps, obstacle layouts, and item distribution. Designers and operations personnel can directly experience the newly generated levels and make modifications based on their experience to determine whether to release the level content.
[0040] According to another embodiment of the invention, the revenue of free-to-play games primarily comes from advertising and in-game purchases, and sustained user engagement is directly correlated with higher revenue opportunities. In online games, a small percentage of users contribute the majority of sales, while the majority of non-paying users are typically those who no longer play the game. Therefore, predicting whether players will make purchases in the future is crucial and helps in developing effective business policies for the gaming industry. Developing predictive models of player purchasing behavior enables game developers to implement personalized marketing strategies, optimize resource allocation, and improve player retention. This forward-looking approach not only maximizes revenue but also significantly improves player satisfaction, thereby gaining an advantage in a competitive gaming market.
[0041] In this invention, the calculation of a user's potential item consumption includes: extracting multiple behavioral features from settlement information, including the user's game activity status, item purchase behavior, game duration, number of sessions, number of completed levels, and acquisition and spending of in-game currency; discretizing the behavioral features, including: using a clustering algorithm to group the values of each behavioral feature into predetermined discrete intervals, thereby generating the representative ranges for the feature; detecting and removing outliers in the behavioral features to reduce the impact of extreme values on the discretization process; mapping the discretized behavioral features to a finite set of morphemes, where each morpheme represents a behavioral state, including: (i) an empty morpheme representing an inactive player; (ii) morphemes representing feature values within different intervals of the normal range; and (iii) special morphemes marking extreme behavioral states; arranging the morphemes generated from the discretized behavioral features in chronological order to form behavioral sequence morphemes, thereby constructing a sequence representation of the player's behavioral history; calculating the similarity between each morpheme and the other morphemes in the sequence based on a multi-head attention mechanism to capture short-term and long-term dependencies in the player's historical behavioral patterns; processing the morphemes in the sequence in parallel using a self-attention mechanism to generate a context vector containing overall information about the player's history; and using the context vector to classify and predict the user's item consumption, and outputting the calculation results.
[0042] The values of each behavioral feature are grouped to form predetermined discrete intervals to generate the representative range of that feature. The specific implementation method is as follows:
[0043] 1. Data Preparation: Input the behavior history matrix for each player, where each row holds the values of one feature over all days. Process each feature (row i) separately, considering only the feature values within active days (i.e., extracting non-zero values from the history matrix), and remove outliers to reduce the impact of extreme values on the clustering results.
[0044] 2. Select the number of clusters: Set the number of clusters K=2 to divide the value of each feature into two main ranges. This division usually corresponds to a low value range and a high value range (or other behavioral categories).
[0045] 3. Algorithm Calculation: Randomly select two initial center points, representing the initial center values of two ranges respectively. Calculate the distance between each data point and the two center points in the following manner:
[0046] $$d(m, c_k) = \sqrt{\sum_{n=1}^{N} \left(m_n - c_{k,n}\right)^2}$$
[0047] Here, m represents a data point. Specifically, in behavioral features, m can be the feature value for a given day (e.g., total number of sessions, amount of gold coins purchased, etc.). If the feature is multidimensional, m represents a multidimensional vector, where the value of the n-th dimension is $m_n$.
[0048] $c_k$ represents the centroid of the k-th cluster. The centroid is a multi-dimensional vector, where each dimension represents the "average behavior" of the cluster along that feature dimension. For example, $c_1$ may represent the cluster center of "low gold coin purchase volume", and its value is the mean of the low value range calculated during the clustering process. $m_n$ represents the feature value of data point m in the n-th dimension. For example, if the behavioral features are "Gold Coin Purchase Amount" and "Gold Coin Spending", then $m_1$ and $m_2$ represent the specific amount of gold coins purchased and the amount of gold coins spent on that day. $c_{k,n}$ represents the value of the center point of the k-th cluster in the n-th dimension. It is obtained by taking the average of the n-th dimension values of all data points belonging to that cluster, i.e.:
[0049] $$c_{k,n} = \frac{1}{|S_k|} \sum_{m \in S_k} m_n$$
[0050] where $|S_k|$ is the number of data points in cluster k and $S_k$ is the set of all data points in cluster k. Calculate the mean of each cluster and use it as the new centroid. Repeat the steps of "assigning data points to clusters" and "updating centroids" until the centroids no longer change significantly (i.e., convergence is achieved). N represents the dimension (or number of features) of the data points. If the features are multidimensional, distance calculations are performed across all dimensions. If the features are one-dimensional, N=1.
[0051] 4. Determine the feature range: After convergence, the center point of each cluster represents the typical value of that range. The two resulting clusters define two main ranges of feature values: the cluster with the smaller center represents the lower range (e.g., low-frequency activity, low gold coin usage, etc.), and the cluster with the larger center represents the higher range (e.g., high-frequency activity, high gold coin usage, etc.).
[0052] 5. Map feature values to a discrete range
[0053] For each feature value, map it to the nearest range based on its distance from the two cluster centers:
[0054] If the feature value is closer to the center of the lower cluster, it is marked as "low";
[0055] If the feature value is closer to the center of the higher cluster, it is marked as "high".
[0056] 6. Output the discretization result
[0057] Each feature is divided into two discrete ranges, which can be used as the discretized input features for subsequent model processing. For example, L4 (total number of sessions for the day) can be divided into "low session frequency" and "high session frequency", and L8 (gold coins purchased that day) into "low purchase volume" and "high purchase volume". Suppose the feature is "gold coin purchase volume", m is a player's gold coin purchase volume on a certain day, and $c_1$ and $c_2$ are the center points of the two clusters: $c_1 = 50$ (center of the low purchase volume cluster) and $c_2 = 200$ (center of the high purchase volume cluster). If the player's gold coin purchase volume on that day is m = 120, then $|m - c_1| = |120 - 50| = 70$ and $|m - c_2| = |120 - 200| = 80$. m is closer to $c_1$, so it is classified into the "low purchase volume" cluster. In this way, each data point is assigned to the cluster with the nearest center point. The aim is to divide the continuous values of each feature into two discrete ranges, retaining the main behavior patterns while reducing the complexity of the data. The method applies to all behavioral features, so that the discretization result of each feature reflects the main trends of player behavior.
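The discretization steps above can be sketched for the one-dimensional case as follows. This is a minimal illustration and not the patented implementation: the function names are hypothetical, the sample purchase amounts are invented so that the centers converge to 50 and 200, and the centers are initialized at the minimum and maximum value rather than at random points so that the result is reproducible.

```python
def kmeans_1d_two_clusters(values, max_iter=100, tol=1e-9):
    """Return (low_center, high_center) for 1-D data using K-means with K=2.

    Centers start at the minimum and maximum value instead of random points,
    which keeps this sketch deterministic.
    """
    low, high = float(min(values)), float(max(values))
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        low_pts = [v for v in values if abs(v - low) <= abs(v - high)]
        high_pts = [v for v in values if abs(v - low) > abs(v - high)]
        # Update step: each center becomes the mean of its cluster.
        new_low = sum(low_pts) / len(low_pts) if low_pts else low
        new_high = sum(high_pts) / len(high_pts) if high_pts else high
        converged = abs(new_low - low) < tol and abs(new_high - high) < tol
        low, high = new_low, new_high
        if converged:
            break
    return low, high


def discretize(value, low_center, high_center):
    """Label a value by its nearest cluster center."""
    return "low" if abs(value - low_center) <= abs(value - high_center) else "high"


# Made-up daily gold-coin purchase amounts.
purchases = [40, 50, 60, 190, 200, 210]
c_low, c_high = kmeans_1d_two_clusters(purchases)
print(c_low, c_high, discretize(120, c_low, c_high))
```

With centers at 50 and 200, a purchase volume of 120 is 70 away from the low center and 80 away from the high center, so it is labeled "low", as in the worked example.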
[0058] The specific implementation steps for detecting and removing outliers are as follows: The values of behavior features (such as gold coin purchase volume, number of levels passed, etc.) may contain outliers. The goal is to identify and remove outliers so that the subsequent clustering process is not disturbed by extreme values. After sorting the data, calculate the following two key statistics: Q1: the value at the 25th percentile of the data, that is, the upper limit of the smallest 25% of the data. Q3: the value at the 75th percentile of the data, that is, the lower limit of the largest 25% of the data. The difference between Q3 and Q1 represents the range of the middle 50% of the data: IQR = Q3 − Q1.
[0059] Calculate the normal data range (upper and lower bounds) according to the IQR. The specific formulas are: Lower Bound = Q1 − 1.5 × IQR; Upper Bound = Q3 + 1.5 × IQR. Values outside this range are usually considered outliers.
[0060] 1.5 × IQR is a statistical empirical value used to balance the boundary between normal values and outliers. If the data point m < Lower Bound or m > Upper Bound, then this point is marked as an outlier. Remove the data marked as outliers from the data set. Return the data set after removing outliers for subsequent steps (such as clustering) to use.
[0061] Suppose the data for a certain feature is: [10,15,20,25,30,35,100] (the data has been sorted). Q1 = 17.5 (25th percentile); Q3 = 32.5 (75th percentile); IQR = Q3 − Q1 = 15. Lower Bound: Q1 − 1.5 × IQR = 17.5 − 1.5 × 15 = −5. Upper Bound: Q3 + 1.5 × IQR = 32.5 + 1.5 × 15 = 55. In the data, 100 > 55, therefore 100 is marked as an outlier. Removing 100 leaves the data as [10, 15, 20, 25, 30, 35]. Using this method, outliers in features can be effectively detected and removed, ensuring that the mainstream trend of the data distribution does not deviate due to extreme values. This step simplifies the data and reduces interference in the clustering process, providing more reliable input for subsequent analysis.
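The outlier rule can be checked with a short Python sketch using only the standard library. The function name `remove_outliers_iqr` is hypothetical, and the choice of the "inclusive" quartile method is an assumption made because it reproduces the quartiles quoted in the example.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Drop points outside [Q1 - k*IQR, Q3 + k*IQR]; return (kept, bounds)."""
    # method="inclusive" interpolates within the sample and yields the
    # quartiles quoted in the text (17.5 and 32.5) for the example data.
    q1, _median, q3 = quantiles(sorted(values), n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if lower <= v <= upper]
    return kept, (lower, upper)


data = [10, 15, 20, 25, 30, 35, 100]
kept, (lower, upper) = remove_outliers_iqr(data)
print(lower, upper, kept)
```

Other quartile conventions give slightly different Q1 and Q3 for small samples, so the bounds depend on the method chosen.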
[0062] Additionally, the behavioral sequence morphemes are input into a Transformer neural network, wherein: (i) the Transformer neural network calculates the similarity between each morpheme and other morphemes in the sequence based on a multi-head attention mechanism to capture short-term and long-term dependencies in the player's historical behavioral patterns; (ii) the Transformer neural network processes the morphemes in the sequence in parallel through a self-attention mechanism to generate a context vector containing overall information about the player's history. Transformer model parameters: Embedding dimension: 512; Number of attention heads: 8; Number of encoder layers: 6; Feedforward network dimension: 2048; Position encoding: Sine absolute position encoding; Dropout rate: 0.1.
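The positional encoding named in the parameter list can be illustrated as below. The sketch assumes the standard sinusoidal formulation from the original Transformer architecture, PE(pos, 2i) = sin(pos / 10000^(2i/d)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d)), with d = 512 as listed above; the patent text does not spell the formula out, and the function name is hypothetical.

```python
import math


def sinusoidal_position_encoding(position, d_model=512):
    """Return the d_model-dimensional encoding of one absolute position."""
    encoding = []
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))  # even index
        encoding.append(math.cos(angle))  # odd index
    return encoding[:d_model]


pe0 = sinusoidal_position_encoding(0)
pe1 = sinusoidal_position_encoding(1)
print(len(pe0), pe0[:2], pe1[:2])
```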
[0063] In the above embodiments, the feature selection process for the purchase prediction task focuses on three groups of predictor variables: player engagement, player skill, and willingness to make in-game purchases. Specifically, features L1, L3, and L4 reflect the level of player engagement in the game. Features L5, L6, L7, and L12 and L13 provide insights into various aspects of player skill. Meanwhile, features L2 and L8 through L11 assess the player's overall engagement in the game and their potential monetization-related behavioral patterns.
[0064] L1: Indicator of any activity that day (0 / 1); L2: Indicator of any purchase that day (0 / 1); L3: Total time spent in the game since registration; L4: Total number of sessions for the day; L5: Total number of levels cleared since registration; L6: Percentage of different levels cleared that day out of the total number of levels cleared; L7: Levels replayed that day as a percentage of all levels played that day; L8: Gold coins (in-game currency) purchased by the player that day; L9: Gold coins (in-game currency) received as rewards that day; L10: Gold coins (in-game currency) spent that day; L11: Gold coins (in-game currency) stored in the inventory that day; L12: Number of replays used that day; L13: Number of buff items used that day.
[0065] Using the features of player s described above, the player's history is represented as a real matrix $G^s$. The rows of $G^s$ represent features, and the columns represent the consecutive days from the registration date t=1 to the last day of the player's history t=$T^s$. Therefore, $G^s$ is a 13×$T^s$ matrix whose entry $h_{i,t}$ is the value of the i-th feature Li on day t, 1≤t≤$T^s$. Note that different players have different $T^s$ values (i.e., the history lengths differ). The goal is to predict whether a purchase will occur within the next k days. Let $y^s_t$ be the value of player s's binary feature L2 on day t. The main objective is to find a classification function such that $G^s \to y^s \in \{0,1\}$, where $y^s = 1$ indicates that player s will make a purchase within the next k days. To make $G^s$ usable as input to a classification model based on a Transformer neural network, $G^s$ is transformed into a continuous value vector and a discrete value sequence.
[0066] Continuous value vector representation: $G^s$ is converted to a vector $m^s = (T^s, f^s, l^s_2, \ldots, l^s_i, \ldots, l^s_{13})$
[0067] where $T^s$ represents the duration of the observed player history, $f^s$ represents the average of the first row of $G^s$ (the percentage of days on which player s played the game), and for 2 ≤ i ≤ 13, the vector $l^s_i$ is represented as
[0068] $(\min(G^s_{i,\rightarrow}), Q1(G^s_{i,\rightarrow}), Q2(G^s_{i,\rightarrow}), Q3(G^s_{i,\rightarrow}), \max(G^s_{i,\rightarrow}))$
[0069] where $G^s_{i,\rightarrow}$ denotes the i-th row of $G^s$ (i.e., the values of Li in the history of s).
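The continuous vector described above (history length, share of active days, then a five-number summary of each of the remaining twelve feature rows) can be sketched as follows. The function names, the toy history, and the quartile interpolation method are assumptions made for illustration only.

```python
from statistics import quantiles


def five_number_summary(row):
    """Return (min, Q1, Q2, Q3, max) of one feature row."""
    q1, q2, q3 = quantiles(row, n=4, method="inclusive")
    return (min(row), q1, q2, q3, max(row))


def continuous_vector(history):
    """Flatten a 13 x T history matrix (list of 13 rows) into one vector."""
    days = len(history[0])                 # T: length of the observed history
    active_share = sum(history[0]) / days  # f: mean of the L1 activity row
    vector = [days, active_share]
    for row in history[1:]:                # rows L2..L13
        vector.extend(five_number_summary(row))
    return vector


# Made-up 4-day history: row 0 is the activity flag L1, other rows are toy values.
toy_history = [[1, 0, 1, 1]] + [[0, 1, 2, 3] for _ in range(12)]
vec = continuous_vector(toy_history)
print(len(vec), vec[:7])
```

The resulting vector has a fixed length of 2 + 12 × 5 = 62 entries regardless of how long the player's history is, which is what makes it usable alongside the variable-length sequence.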
[0070] The basic input unit of a Transformer neural network is a categorical value, such as a word fragment or a complete word. To use this model, $G^s$ must be converted into categorical values. This is achieved by discretizing the feature values in $G^s$. Each day of a player's history (a column of $G^s$) is represented by a sequence of 13 morphemes, each reflecting the value of the corresponding feature for that day. The first two binary features, L1 and L2, are each represented by one of two morphemes, indicating whether the corresponding feature is 0 or 1. The remaining features Li (2 < i ≤ 13) are each represented by one of four morphemes depending on the feature value:
[0071] T1 indicates that the player's Li value is missing due to inactivity on that day;
[0072] T2 indicates that the value of Li is an outlier;
[0073] T3 and T4 indicate that the value of Li belongs to one of the two characteristic ranges within the characteristic Li domain.
[0074] The feature range of each feature is found independently using a mean algorithm (K=2). This algorithm is applied to the actual feature values of all players over all active days (i.e., the i-th row of all history matrices). Since the mean algorithm is sensitive to outliers, extreme values for each feature Li are processed first. For multidimensional data, K-means typically requires normalization or standardization to achieve efficient clustering, but these preprocessing steps are not necessary for one-dimensional data. The original matrix Li is then transformed into the corresponding labeled history matrix TG. sThe substitution and value tokenization are performed as described above. The history of each player s is represented as a sequence m. s By using matrix TG s It is obtained by concatenating the columns. That is, sequence m s The format is as follows:
[0075] $m^s = \left(TG^s_{1}, \ldots, TG^s_{t}, \ldots, TG^s_{T_s}\right)$
[0076] where $TG^s_{t}$ is the t-th column of matrix $TG^s$ and holds the morphemes of player s on day t. Note that $m^s$ is a sequence of length $13T_s$ whose elements are drawn from a morpheme vocabulary of no more than 48 morphemes.
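The range finding and the column concatenation described above can be sketched as follows. This is an illustrative sketch only: `two_means_boundary` is a plain one-dimensional K-means with K=2 and an assumed initialization at the extremes, `concat_columns` is a hypothetical helper, and the vocabulary count simply restates the arithmetic implied by the text (two binary features with two morphemes each, eleven features with four morphemes each).

```python
import numpy as np

def two_means_boundary(values, iters=50):
    """1-D K-means with K=2; returns the boundary between the two clusters."""
    v = np.asarray(values, dtype=float)
    c_lo, c_hi = v.min(), v.max()            # assumed initial centers: the extremes
    for _ in range(iters):
        boundary = (c_lo + c_hi) / 2.0
        lo, hi = v[v <= boundary], v[v > boundary]
        if len(lo) == 0 or len(hi) == 0:     # all values fell into one cluster
            break
        new_lo, new_hi = lo.mean(), hi.mean()
        if new_lo == c_lo and new_hi == c_hi:  # centers stopped moving
            break
        c_lo, c_hi = new_lo, new_hi
    return (c_lo + c_hi) / 2.0

def concat_columns(tg):
    """Concatenate the columns of a tokenized matrix TG^s into the sequence m^s."""
    tg = np.asarray(tg)
    return tg.T.reshape(-1).tolist()         # day 1's morphemes, then day 2's, ...

# Vocabulary size implied by the text: 2 binary features x 2 + 11 features x 4
VOCAB_SIZE = 2 * 2 + 11 * 4

# Toy 2 x 2 matrix (a real TG^s has 13 rows)
m_s = concat_columns([["a1", "a2"], ["b1", "b2"]])
```

Because the data are one-dimensional, the two clusters are fully described by a single boundary, which is why no normalization step appears in the sketch.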
[0077] The Transformer model's encoder captures rich contextual relationships between morphemes, yielding a more comprehensive data representation that reveals short-term and long-term patterns in player history. Since morphemes themselves have no inherent order, they need to be mapped to a Euclidean space where their similarity and distance can be measured while preserving their mutual characteristics. The embedding space is a vector space that captures the latent relationships and patterns between morphemes. The optimal embedding space for describing the morphemes and their associations is found by training a Transformer classification model. The input to the Transformer model is the discretized sequence of behavioral-feature morphemes. If the morpheme sequences differ in length, they are padded to a uniform length, and masks are added so that the padding values do not affect the computation. Mapping the discrete morphemes to a high-dimensional embedding space produces an embedding matrix E = Embedding(T), where T is the input morpheme sequence and E is the embedding matrix, in which each morpheme corresponds to one embedding vector. To capture the temporal order of the sequence, positional encoding is added to the embedding matrix, giving E′. The positional encoding is calculated using sine and cosine functions and provides each morpheme with its relative position in the sequence. From E′, linear transformations generate the query, key, and value vectors: Q = E′W_q, K = E′W_k, and V = E′W_v, where W_q, W_k, and W_v are learnable weight matrices. Similarity scores are calculated as dot products of the query and key vectors and then scaled.
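The embedding, positional encoding, and scaled dot-product steps of this paragraph can be sketched in NumPy. Everything concrete here is an assumption for illustration: the dimensions, the random weights standing in for trained ones, the reserved padding id 0, and the softmax normalization of the scaled scores (the standard step in scaled dot-product attention, which the paragraph does not spell out).

```python
import numpy as np

def sinusoidal_positions(length, dim):
    """Sine/cosine positional encoding of shape (length, dim); dim assumed even."""
    pos = np.arange(length)[:, None]
    i = np.arange(dim)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    pe = np.zeros((length, dim))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def self_attention(e_prime, w_q, w_k, w_v, mask):
    """Single-head scaled dot-product attention that ignores padded positions."""
    q, k, v = e_prime @ w_q, e_prime @ w_k, e_prime @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])           # scaled similarity scores
    scores = np.where(mask[None, :], scores, -1e9)    # mask out padded keys
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v, weights

rng = np.random.default_rng(0)
vocab_size, dim, length = 49, 8, 6          # 48 morphemes + 1 padding id (assumed)
embedding = rng.normal(size=(vocab_size, dim))
tokens = np.array([3, 17, 5, 40, 0, 0])     # hypothetical sequence; id 0 = padding
mask = tokens != 0
e = embedding[tokens]                        # E = Embedding(T)
e_prime = e + sinusoidal_positions(length, dim)
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
context, weights = self_attention(e_prime, w_q, w_k, w_v, mask)
```

Each row of `weights` sums to 1 and assigns zero weight to the two padded positions, which is the effect the masking described above is meant to achieve.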
[0078] For the representation vector $e_k$ of each morpheme in the embedding space, the self-attention mechanism computes dot products with the representation vectors of the other morphemes in the sequence. The dot product measures the similarity between two vectors in the embedding space, so the model can assess the importance of each relationship by comparing all pairs simultaneously. This process constructs new representation vectors that express the entire morpheme sequence from the perspective of each input morpheme. The encoder output contains the contextual embeddings of all morphemes in the input sequence, so mean pooling is applied to obtain a single embedding vector representing the player's history. This vector is passed through an additional linear layer that outputs the logits of the "buy" and "don't buy" categories. The final category is the one with the maximum logit. The Transformer encoder architecture is shown in the figure. The model architecture is defined by hyperparameters such as the size of the embedding space, the number of multi-head attention layers in the encoder, the number of attention heads, the size of the hidden layers in the feedforward network, the number of training epochs, and the learning rate. The Transformer-based encoder for processing morpheme sequences includes a multi-head attention mechanism, a feedforward network, normalization layers, and residual connections. This architecture extracts contextual information from the input morpheme sequence, capturing short-term and long-term dependencies. A morpheme is a discretized, tokenized representation of the original data. Each input morpheme is mapped into a high-dimensional vector space, forming its embedding representation. To preserve the order of the morphemes in the sequence, the positional encoding is added to the morpheme embedding.
This step enables the model to distinguish morphemes at different positions and so capture the temporal characteristics of the sequence. Multiple attention heads operate in parallel, each focusing on different relationships in the input sequence to capture contextual information. Residual connections are used to mitigate vanishing gradients, and normalization stabilizes the training process. The contextual representation of each morpheme is further processed by the feedforward network to enhance the model's non-linear expressive power. Mean pooling is performed on the contextual embedding vectors of all morphemes to generate a global representation of the entire sequence. This step reduces the representation of multiple morphemes to a single fixed-size vector. The global sequence representation is fed into a linear layer, which outputs two values, one for each of the categories "buy" and "don't buy", without applying an activation function such as Softmax.
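The pooling and classification head described in this paragraph can be sketched as follows. The function name `classify_history`, the class ordering (index 0 for "don't buy", index 1 for "buy"), the exclusion of padded positions from the mean, and the toy weights are assumptions for illustration.

```python
import numpy as np

def classify_history(context, mask, w_out, b_out):
    """Masked mean pooling over contextual embeddings, then a linear layer.

    context: (sequence_length, dim) encoder output
    mask:    (sequence_length,) True for real morphemes, False for padding
    Returns the two raw logits and the index of the larger one.
    """
    m = mask.astype(float)[:, None]
    pooled = (context * m).sum(axis=0) / m.sum()   # one vector per player history
    logits = pooled @ w_out + b_out                # no activation applied
    return logits, int(np.argmax(logits))

# Toy encoder output: two real morphemes and one padded position
context = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]])
mask = np.array([True, True, False])
w_out = np.array([[1.0, 0.0], [0.0, 3.0]])        # hypothetical trained weights
b_out = np.zeros(2)
logits, label = classify_history(context, mask, w_out, b_out)
```

With these toy values the pooled vector is (0.5, 0.5), the logits are (0.5, 1.5), and the selected class index is 1.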
[0079] Based on the output of the linear layer, the final purchase prediction probability is calculated. The Transformer encoder architecture processes the entire sequence in parallel, which significantly improves processing efficiency. Multi-head attention captures both short-term and long-term dependencies in the sequence, making the model more expressive in prediction tasks. Through mean pooling and positional encoding, the model can handle morpheme sequences of variable length. Through self-attention and a multi-layer network structure, the model efficiently extracts contextual information from the sequence and generates classification results. The model's design emphasizes flexibility in handling input morpheme sequences, the ability to capture contextual relationships, and efficient classification.
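One common way to turn the two logits into a purchase probability is a softmax, sketched below. The paragraph does not name the function used, so the softmax, the function name `purchase_probability`, and the class ordering (index 1 for "buy") are assumptions.

```python
import numpy as np

def purchase_probability(logits):
    """Softmax over the two logits; returns the probability of the 'buy' class."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()                      # subtract the max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return float(p[1])                   # index 1 = "buy" (assumed ordering)

p_equal = purchase_probability([0.0, 0.0])
p_buy = purchase_probability([0.5, 1.5])
```

Equal logits give a probability of 0.5, and a larger "buy" logit pushes the probability above 0.5, so the class chosen by the maximum logit is the same as the class with the higher probability.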
[0080] The performance of the method described above was evaluated on the in-game purchase prediction task. The model was evaluated on the test portions of 10 randomly generated splits into a training set (80%) and a test set (20%). Models were compared using a standard multi-class classification metric, the macro-averaged F1 score. In each iteration, both the training and test portions were expanded to improve the model's performance on larger datasets while adapting to player histories of different lengths. The obtained F1 scores demonstrate the superior performance of the model and the self-attention technique across all prediction periods, suggesting that a more comprehensive observation of player histories can provide significant predictive power.
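The evaluation protocol can be sketched as follows: a macro-averaged F1 score computed per class and averaged, plus a generator of random 80%/20% splits. The function names, the seed, and the sample labels are assumptions, and no result from the patent is reproduced here.

```python
import numpy as np

def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

def random_splits(n_players, n_splits=10, train_frac=0.8, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = np.random.default_rng(seed)
    cut = int(round(train_frac * n_players))
    for _ in range(n_splits):
        order = rng.permutation(n_players)
        yield order[:cut], order[cut:]

# Hypothetical labels: 1 = "buy", 0 = "don't buy"
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
splits = list(random_splits(100))
```

In the toy example the per-class F1 scores are 0.8 and 2/3, so the macro average weights the rarer class equally with the common one, which is why this metric suits imbalanced purchase data.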
[0081] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method of game level design and creation, characterized in that the method comprises the following steps: providing a level editing interface, so that the planner can create, modify, save and synchronize levels to the terminal for testing and previewing through the computer backend; rendering the edited level in real time on the terminal, so that the planner can experience the level design effect from the perspective of a game player and optimize and adjust the level layout; executing the actions of the level from the perspective of a player on the terminal; obtaining the settlement information after completing the level, and synchronously transmitting the settlement information between the terminal and the backend, wherein the settlement information is stored in the backend data pool; generating a new level based on the settlement information, wherein the new level includes a level template matching the distribution of props of a specified difficulty, the potential prop consumption of the user is predicted based on the player behavior data, and the prediction result is used to determine the distribution of props in the new level, wherein the prediction includes: obtaining a plurality of behavior characteristics of a player s in a historical period, and constructing a corresponding player historical real number matrix $G^s$, wherein the player historical real number matrix $G^s$ is a $13 \times T_s$ matrix, the rows of the matrix correspond to preset behavior characteristics, the columns correspond to consecutive days from a registration date to the last day of the player history, and $T_s$ represents the length of the history of the player s; wherein the preset behavior characteristics at least include: whether there is an activity on the day, whether there is a purchase on the day, total game time, total session number on the day, cumulative number of passed levels, proportion of different levels passed on the day, proportion of levels replayed on the day, game currency purchased on the day, game currency obtained as a reward on the day, game currency spent on the day, game currency in inventory on the day, number of replay times used on the day, and number of gain prop times used on the day; discretizing each behavior feature in the player historical real number matrix $G^s$ and mapping the continuous feature values to a finite set of morphemes, to obtain a corresponding tokenized history matrix $TG^s$; concatenating the columns of the tokenized history matrix $TG^s$ in sequence to generate a player behavior morpheme sequence $m^s$; inputting the player behavior morpheme sequence $m^s$ into a Transformer neural network, which first maps the morpheme sequence into an embedding matrix, combines it with position encoding to generate an input representation containing time sequence information, and then extracts short-term and long-term dependencies between morphemes through self-attention and multi-head attention mechanisms to obtain a context vector representing the overall information of the player's history; and, based on the context vector, outputting the prediction result of the potential prop consumption of the user through the classification layer.
2. The method of claim 1, wherein generating a new level based on the settlement information comprises: calculating the difficulty curve of each level based on the settlement information, wherein the difficulty curve parameters at least include the level difficulty, the user churn situation and the potential prop consumption of the user; and generating a new level according to the difficulty curve.
3. The method of claim 2, wherein the calculation of the level difficulty includes: calculating the number of relationships of each level according to the user level settlement information, wherein the calculation method of the number of relationships includes: calculating, according to the data of the user level settlement information, the average number of games for passing each level; calculating, according to the data of the user level settlement information, the average prop use situation of the user passing the level; and calculating, according to the actual data of the player, the number of games required for passing and the prop consumption amount of the proportion of using props affecting the difficulty of the user passing the level.
4. The method of claim 3, wherein the calculation of the potential prop consumption of the user includes: extracting a plurality of behavior characteristics from the settlement information, wherein the behavior characteristics include the game activity state of the user, the purchase prop behavior, the game time, the session number, the number of completed levels, the game currency obtained and spent; discretizing the behavior characteristics, grouping the values of each behavior characteristic using a clustering algorithm, forming a predetermined discrete interval, and generating a representative range of the characteristic; detecting and removing outliers in the behavior characteristics to reduce the influence of extreme values on the discretization process; mapping the discretized behavior characteristics to a limited set of morphemes, wherein each morpheme represents a behavior state, including: (i) an empty morpheme representing a player who is not active; (ii) a morpheme representing different intervals of a characteristic within a normal range; (iii) a special morpheme marking an extreme behavior state; arranging the morphemes generated from the discretized behavior features in time sequence to form a behavior morpheme sequence, thereby constructing a sequence representation of the player behavior history; calculating the similarity between each morpheme and the other morphemes in the sequence based on the multi-head attention mechanism to capture short-term and long-term dependencies in the player's historical behavior patterns; processing the morphemes in the sequence in parallel using the self-attention mechanism to generate a context vector containing the overall information of the player's history; and using the context vector to classify and predict the user's consumption of props, and outputting the calculation result.
5. The method of claim 4, wherein the values of each behavior feature are grouped using the K-means clustering algorithm to form predetermined discrete intervals and generate the representative range of the feature.