Chinese character generation methods, devices and computer equipment
By constructing a multi-branch tree and combining it with the target font style to generate Chinese characters, the problem of low efficiency in Chinese character generation in traditional methods is solved, and faster and lower-cost Chinese character generation and recognition model training is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date
- 2021-08-17
- Publication Date
- 2026-06-30
AI Technical Summary
Traditional Chinese character generation methods rely on handwriting trajectories, resulting in low efficiency in character generation.
By obtaining the components and shape structure of the Chinese character to be generated, a multi-branch tree is constructed. The multi-branch tree is traversed and encoded using parent-child relationships, and the Chinese character is generated in combination with the target font style.
It improves the efficiency of Chinese character generation, reduces reliance on image and trajectory information, lowers costs, and makes the component distribution in the Chinese character database more balanced, thus enhancing the recognition accuracy of rare character components.
Figure CN115906771B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, and in particular to a method, apparatus, computer device, and storage medium for generating Chinese characters.
Background Technology
[0002] When computer devices process files containing Chinese characters, they typically need to call a Chinese character recognition model to identify the characters. Before the Chinese character recognition model can recognize characters, it needs to be trained using various Chinese characters as training data. This requires a method that can automatically generate Chinese characters.
[0003] Traditional Chinese character generation methods rely on a large number of handwritten trajectories, and each trajectory needs to include x-axis and y-axis offsets, as well as three states: stroke end, character end, and stroke completion, in order to generate a Chinese character. Traditional methods suffer from low efficiency in character generation.
Summary of the Invention
[0004] Therefore, it is necessary to provide a Chinese character generation method, apparatus, computer equipment, and storage medium that can improve the efficiency of Chinese character generation in response to the above-mentioned technical problems.
[0005] A method for generating Chinese characters, the method comprising:
[0006] Obtain more than one component and at least one shape structure corresponding to the Chinese character to be generated;
[0007] A multi-branch tree corresponding to the Chinese character to be generated is generated based on the components and the shape structure; the leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure.
[0008] Based on the parent-child relationships between nodes in the multi-branch tree, at least one tuple in the multi-branch tree is determined; each tuple includes a parent node and at least two child nodes belonging to the parent node.
[0009] Following the direction from the leaf node to the root node of the multi-branch tree, the tuples are traversed sequentially and the encoding process is performed based on the traversed tuples until the root node is reached, thus obtaining the Chinese character representation.
[0010] Obtain the target font representation corresponding to the target font style, and generate the target Chinese character based on the Chinese character representation and the target font representation.
[0011] A Chinese character generation device, the device comprising:
[0012] The acquisition module is used to acquire more than one component and at least one shape structure corresponding to the Chinese character to be generated;
[0013] The structure processing module is used to generate a multi-branch tree corresponding to the Chinese character to be generated based on the components and the shape structure; the leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure.
[0014] The structure processing module is further configured to determine at least one tuple in the multi-branch tree based on the parent-child relationships between nodes in the multi-branch tree; each tuple includes a parent node and at least two child nodes belonging to the parent node;
[0015] The encoding module is used to traverse the tuples sequentially in the direction from the leaf nodes to the root node of the multi-branch tree and perform encoding processing based on the traversed tuples until the root node is reached, so as to obtain the Chinese character representation.
[0016] The Chinese character generation module is used to obtain the target font representation corresponding to the target font style, and generate the target Chinese character based on the Chinese character representation and the target font representation.
[0017] In one embodiment, the acquisition module is further configured to acquire the input text corresponding to the Chinese character to be generated, and to annotate the input text with radicals; to split the input text based on the radical annotations to obtain more than one component; and to determine at least one shape structure for forming the input text by the component.
[0018] In one embodiment, the structure processing module is further configured to obtain the hierarchical relationship represented by each of the shape structures in the Chinese character to be generated, and the components corresponding to each shape structure; and to arrange the shape structures and their corresponding components in the order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure to generate a corresponding multi-branch tree.
[0019] In one embodiment, the structure processing module is further configured to use the outermost shape structure as the root node of the multi-branch tree to be generated; if there is a second-outermost shape structure, then, proceeding from the outside to the inside according to the hierarchical relationship of the shape structures and starting from the second-outermost shape structure, to use the shape structure of the current layer as a child node of the shape structure of the layer immediately outside the current layer; if the shape structure of a layer has corresponding components, to use those components as leaf nodes under the node where the shape structure of that layer is located; and to generate the multi-branch tree based on the root node, the leaf nodes, and the intermediate nodes between the root node and the leaf nodes.
[0020] In one embodiment, the structure processing module is further configured to treat each node in the multi-branch tree corresponding to the shape structure as a target parent node; for each target parent node, if the child node of the target parent node is a leaf node, then the leaf node belonging to the target parent node is directly used as the target child node of the corresponding target parent node; if the child node of the target parent node is another target parent node, then a preset symbol node is used as the target child node of the corresponding target parent node; the preset symbol node represents the subtree structure corresponding to the other target parent node; and each target parent node and the target child node belonging to the target parent node form a tuple.
[0021] In one embodiment, the encoding module is further configured to traverse the tuples in the multi-branch tree sequentially in the direction from the leaf nodes to the root node; encode the content corresponding to each node in the currently traversed tuple to obtain the subtree encoding result corresponding to the currently traversed tuple; if the currently traversed tuple corresponds to the target child node of another tuple, use the subtree encoding result corresponding to the currently traversed tuple as the content of the target child node of the other tuple; take the next tuple as the current tuple and return to the step of encoding the content corresponding to each node in the currently traversed tuple, continuing until the root node is reached; and output the subtree encoding result corresponding to the tuple where the root node is located as the Chinese character representation.
[0022] In one embodiment, the encoding module is further configured to: if the target child node in the currently traversed tuple corresponds to the target parent node of another tuple, then use the subtree encoding result corresponding to the other tuple as a first vector; if the content of the target child node in the currently traversed tuple is a component, then determine that the data type of the content of the target child node is a character, and encode the content of the target child node as a second vector; encode the shape structure corresponding to the target parent node in the currently traversed tuple as a third vector; and determine the subtree encoding result corresponding to the currently traversed tuple based on at least one of the first and second vectors, and the third vector.
[0023] In one embodiment, the encoding module is further configured to store the subtree encoding result corresponding to the currently traversed tuple into a last-in-first-out stack; if the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, the subtree encoding result is obtained from the last-in-first-out stack and used as the content of the target child node of the next tuple.
[0024] In one embodiment, the Chinese character generation module is further configured to filter out the target font representation corresponding to the target font style from at least one candidate font representation; the acquisition module is further configured to acquire handwritten Chinese character images of different font styles; and encode each of the handwritten Chinese character images to obtain candidate font representations corresponding to each font style.
[0025] In one embodiment, the Chinese character generation module is further configured to combine the Chinese character representation and the target font representation to obtain a combined representation; process the combined representation through a feedforward network to obtain an image representation vector; adjust the dimension of the image representation vector to a preset dimension and adjust the size of the image representation vector to a preset size, and generate a target Chinese character of image type based on the adjusted image representation vector.
[0026] In one embodiment, the device further includes a training module; the training module is used to train the Chinese character recognition model using the target Chinese character as training data to obtain a trained Chinese character recognition model; the trained Chinese character recognition model is used to recognize handwritten Chinese characters in the image to be recognized.
[0027] In one embodiment, the device further includes a recognition module; the recognition module is also used to display an image recognition interface; upload an image to be recognized containing handwritten Chinese characters through the image recognition interface; perform image recognition processing on the image to be recognized through the trained Chinese character recognition model to obtain and output the Chinese character content included in the image to be recognized.
[0028] In one embodiment, the structure processing module is further configured to traverse each node in the multi-branch tree according to a depth-first and left-subtree-first traversal order, and record the component or shape structure corresponding to the corresponding node according to the traversal order to obtain a sequence structure; wherein, if the multi-branch tree has only one subtree structure, the sequence structure is represented by a subset; if the multi-branch tree includes multiple subtree structures, the sequence structure is represented by a set composed of multiple nested subsets, each subset representing a subtree structure.
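As a rough illustration of the sequence structure described in this embodiment, the following Python sketch walks a tree depth-first, left subtree first, and records every subtree as a nested list. The `Node` class, the function name `to_sequence`, and the use of Python lists to stand for "subsets" are assumptions made for illustration only. The example tree is the one described for the character "满" (left-right structure at the root, up-down structure below it), and the identifiers "a" and "d" are the structure identifiers given for Figure 3.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Node:
    """A shape-structure identifier (non-leaf) or a component (leaf)."""
    label: str
    children: List["Node"] = field(default_factory=list)

Sequence = Union[str, List["Sequence"]]

def to_sequence(node: Node) -> Sequence:
    """Depth-first, left-first traversal; each subtree becomes one nested list."""
    if not node.children:
        return node.label  # a leaf contributes its component directly
    # Record the structure first, then each child in left-to-right order.
    return [node.label] + [to_sequence(child) for child in node.children]

# Assumed layout of the tree for "满": "氵" on the left, up-down subtree on the right.
man = Node("a", [Node("氵"), Node("d", [Node("艹"), Node("两")])])
sequence = to_sequence(man)
print(sequence)
```

A character with a single shape structure yields one flat list, while each additional shape structure adds one level of nesting.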
[0029] A computer device includes a memory and a processor, the memory storing a computer program, the processor executing the computer program to implement the steps of the method described above.
[0030] A computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the method described above.
[0031] The aforementioned Chinese character generation method, apparatus, computer equipment, and storage medium acquire more than one component and at least one shape structure corresponding to the Chinese character to be generated, thereby generating a multi-branch tree corresponding to the Chinese character to be generated. The leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure. Based on the parent-child relationships between nodes in the multi-branch tree, at least one tuple in the multi-branch tree corresponding to the Chinese character to be generated can be determined. Following the direction from the leaf node to the root node of the multi-branch tree, the tuples are traversed sequentially, and encoding processing is performed based on the traversed tuples until the root node is reached, thus obtaining the Chinese character representation. Then, the target font representation corresponding to the target font style is obtained, and the target Chinese character can be generated. It is understandable that components and shape structures are low-cost and easier-to-obtain data, while images and trajectory information are high-cost and difficult-to-obtain data. Based on components and shape structures, dependence on image and trajectory information can be avoided, thus enabling faster and lower-cost generation of Chinese characters, providing a large amount of training data for Chinese character recognition models quickly and cost-effectively; it can also reduce the time cost of collecting Chinese character data for the target font style.
[0032] Furthermore, acquiring the components and structural features of the Chinese character to be generated allows for more accurate and targeted generation of target characters containing those components, thereby increasing the number of characters with those components and making the distribution of characters with those components in the Chinese character database more balanced. Increasing the number of characters with those components also enhances the ability to extract detailed features from characters with those components. For example, target characters with rare components can be generated by strategically combining rare components with other components to create new characters. This results in a larger number of characters with those rare components in the Chinese character database, a more balanced distribution, and provides more characters with those rare components to the Chinese character recognition model, thus improving the accuracy of the model in recognizing those rare components.
[0033] In addition, after generating target Chinese characters with the target font style, it is also possible to obtain the annotations of the target Chinese characters, including components and shape structure, and use the annotated Chinese characters as data for other tasks to augment the data of other tasks.
Attached Figure Description
[0034] Figure 1 is a diagram illustrating the application environment of a Chinese character generation method in one embodiment;
[0035] Figure 2 is a flowchart illustrating a Chinese character generation method in one embodiment;
[0036] Figure 3 is a schematic diagram of the shape structures in one embodiment;
[0037] Figure 4 is a schematic diagram of a binary tree in one embodiment;
[0038] Figure 5 is a schematic diagram illustrating the splitting of the Chinese character to be generated in one embodiment;
[0039] Figure 6 is a schematic diagram of the sequence labeling of the Chinese character to be generated in one embodiment;
[0040] Figure 7 is a schematic diagram illustrating the generation of tuples from a multi-branch tree in one embodiment;
[0041] Figure 8 is a schematic diagram illustrating the generation of triples from the Chinese character to be generated in one embodiment;
[0042] Figure 9 is a schematic diagram illustrating the Chinese character representation in one embodiment;
[0043] Figure 10 is a schematic diagram of a recursive tree encoder encoding tuples in one embodiment;
[0044] Figure 11 is a schematic diagram illustrating the encoding of Chinese character images with different font styles to obtain font representations in one embodiment;
[0045] Figure 12 is a schematic diagram of a target Chinese character with the target font style generated in one embodiment;
[0046] Figure 13 is a schematic diagram illustrating the recognition of handwritten Chinese characters in one embodiment;
[0047] Figure 14 is a schematic diagram of a Chinese character generation method in one embodiment;
[0048] Figure 15 is a structural block diagram of a Chinese character generation device in one embodiment;
[0049] Figure 16 is an internal structural diagram of a computer device in one embodiment.
Detailed Implementation
[0050] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0051] The Chinese character generation method provided in this application can be applied in the application environment shown in Figure 1, in which the server 102 communicates with the terminal 104 through a network. The method can be implemented by the terminal or the server alone, or by the terminal and the server in cooperation. Taking cooperation between the terminal and the server as an example, the terminal 104 obtains the input text corresponding to the Chinese character to be generated and sends the input text to the server 102. The server 102 obtains more than one component for constructing the Chinese character to be generated, and at least one shape structure present in the Chinese character to be generated; generates a multi-branch tree corresponding to the Chinese character to be generated based on the components and the shape structure, where the leaf nodes of the multi-branch tree correspond to the components and the non-leaf nodes correspond to the shape structures; determines at least one tuple in the multi-branch tree based on the parent-child relationships between the nodes in the multi-branch tree, where each tuple includes a parent node and at least two child nodes belonging to the parent node; traverses the tuples sequentially in the direction from the leaf nodes to the root node of the multi-branch tree and performs encoding processing based on the traversed tuples until the root node is reached, obtaining a Chinese character representation; and obtains a target font representation corresponding to the target font style and, based on the Chinese character representation and the target font representation, generates a target Chinese character with the target font style.
[0052] The terminal 104 can be, but is not limited to, a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, or a smart TV. The server 102 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing cloud computing services.
[0053] In one embodiment, as shown in Figure 2, a Chinese character generation method is provided. The method is described here as applied to a computer device, which can specifically be the terminal or the server in Figure 1. The method includes the following steps:
[0054] Step S202: obtain more than one component and at least one shape structure corresponding to the Chinese character to be generated.
[0055] The Chinese character to be generated can be a character that actually exists and is in use, or a character that does not exist and has never been used. A component is a constituent part of the Chinese character to be generated; it can be a side component (pianpang) of the character or a constituent part of a side component. A side component is a part used to form a Chinese character, whereas an indexing radical (bushou) is the first stroke or the semantic component of a Chinese character. An indexing radical is therefore always a side component, but a side component is not necessarily an indexing radical.
[0056] For example, the components of the Chinese character "pin" to be generated are "fen" and "bei"; both "fen" and "bei" are side components, and "bei" is also the indexing radical. As another example, the components of the Chinese character "pin" to be generated can be "ba", "dao", "tong" and "ren".
[0057] A shape structure is a structure that combines more than one component. The shape structures can specifically include the left-right structure, up-down structure, upper-left surrounded structure, left-three-surrounded structure, upper-three-surrounded structure, lower-three-surrounded structure, upper-right surrounded structure, lower-left surrounded structure, full-surrounded structure, and embedded structure. Examples of Chinese characters with each structure are: left-right structure "zheng, wei, xiu, da, ming, sha"; up-down structure "zhi, miao, zi, wei, sui, jun"; upper-left surrounded structure "miao, bing, fang, ni, mei, li"; left-three-surrounded structure "qu, ju, za, xia, chen, yi"; upper-three-surrounded structure "tong, wen, nao, zhou, feng, gang"; lower-three-surrounded structure "ji, xiong, han, hua, you, bin"; upper-right surrounded structure "ju, ke, si, shi, rong, shi"; lower-left surrounded structure "jian, lian, tan, gan, chao, chi"; full-surrounded structure "qiu, tuan, yin, ling, yuan, guo"; and embedded structure "zuo, shuang, jia, e, wu, yi".
[0058] The shape structures can also include the left-middle-right structure and the up-middle-down structure. Examples of Chinese characters with the left-middle-right structure are "hu, jiao, jian, xie, zuo, zhou", and with the up-middle-down structure "xi, ji, bing, xie, ying, yan".
[0059] Each shape structure can be represented by a structure identifier. The structure identifier can be a string, a number, a graphic, etc. As shown in Figure 3, the left-right structure is represented by "a", the up-down structure by "d", the upper-left surrounded structure by "stl", the left-three-surrounded structure by "sl", the upper-three-surrounded structure by "sb", the lower-three-surrounded structure by "st", the upper-right surrounded structure by "str", the lower-left surrounded structure by "sbl", the full-surrounded structure by "s", and the embedded structure by "w".
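The identifier scheme above can be captured in a small lookup table. The sketch below is only one possible encoding: the identifiers are the ones listed for Figure 3, while the dictionary name `STRUCTURE_IDS`, the English keys, and the helper `structure_id` are hypothetical names chosen for illustration.

```python
from typing import Dict

# Identifiers as listed in the text for Figure 3; key spellings are illustrative.
STRUCTURE_IDS: Dict[str, str] = {
    "left-right": "a",
    "up-down": "d",
    "upper-left surrounded": "stl",
    "left-three-surrounded": "sl",
    "upper-three-surrounded": "sb",
    "lower-three-surrounded": "st",
    "upper-right surrounded": "str",
    "lower-left surrounded": "sbl",
    "full-surrounded": "s",
    "embedded": "w",
}

def structure_id(name: str) -> str:
    """Return the identifier for a shape structure name; raises KeyError if unknown."""
    return STRUCTURE_IDS[name]

print(structure_id("left-right"), structure_id("up-down"))
```

Because every identifier is distinct, the table can be inverted when a tree node's identifier needs to be turned back into a readable structure name.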
[0060] Specifically, when the Chinese character to be generated has two components, it has one shape structure; when it has more than two components, it has more than one shape structure. For example, the components of the Chinese character "pin" to be generated are "fen" and "bei", and this character has one shape structure, the up-down structure. As another example, the components of the Chinese character "ni" to be generated are "扌", "shi", and "bi". This character has two shape structures, namely the left-right structure and the upper-left surrounded structure. The upper-left surrounded structure is the structure combining "shi" and "bi", and the left-right structure is the structure combining "扌" with the "ni" formed from "shi" and "bi".
[0061] Step S204: generate a multi-branch tree corresponding to the Chinese character to be generated based on the components and the shape structure; the leaf nodes of the multi-branch tree correspond to the components, and the non-leaf nodes correspond to the shape structures.
[0062] A multi-branch tree is a tree structure composed of multiple branches. The multi-branch tree can be a binary tree, a ternary tree, a quaternary tree, etc., which is not limited in this application. A multi-branch tree has a root node, leaf nodes, and intermediate nodes between the root node and the leaf nodes. The leaf nodes correspond to the components, and the root node and the intermediate nodes are non-leaf nodes that correspond to the shape structures. The root node is the node in the multi-branch tree that has no parent node. A leaf node is a node in the multi-branch tree that has no child node.
[0063] In one embodiment, taking a binary tree as an example of the multi-branch tree, Figure 4 shows the binary tree corresponding to the Chinese character "满" to be generated. Node 402 is the root node of the binary tree and corresponds to the left-right structure. Node 404 is an intermediate node of the binary tree and corresponds to the up-down structure. "氵", "艹", and "两" are all leaf nodes of the binary tree.
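The binary tree for "满" can be written down as a small data structure. The following Python sketch is an illustration under assumptions, not the patented implementation: the `Node` class and the `leaves` helper are hypothetical, the labels "a" and "d" are the structure identifiers given for Figure 3, and the placement of "氵" as the left child of the root is inferred from the left-right structure of the character rather than read from Figure 4.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """A shape-structure identifier (non-leaf) or a component (leaf)."""
    label: str
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children

# Root: left-right ("a"); intermediate node: up-down ("d"); leaves: components.
man = Node("a", [Node("氵"), Node("d", [Node("艹"), Node("两")])])

def leaves(node: Node) -> List[str]:
    """Collect the component labels in left-to-right order."""
    if node.is_leaf:
        return [node.label]
    collected: List[str] = []
    for child in node.children:
        collected.extend(leaves(child))
    return collected

print(leaves(man))
```

Since `children` is a list of any length, the same class also covers ternary structures such as left-middle-right without modification.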
[0064] Step S206: based on the parent-child relationships between the nodes in the multi-branch tree, determine at least one tuple in the multi-branch tree; each tuple includes a parent node and at least two child nodes belonging to the parent node.
[0065] A tuple is a combination of multiple elements. Here, a tuple includes at least three elements: a parent node and at least two child nodes belonging to the parent node. The tuple can be a triple, a quadruple, a quintuple, etc., which is not limited in this application.
[0066] In one embodiment, the content of the parent node in a tuple can be a shape structure, and the at least two child nodes belonging to the parent node can be components or subtrees. If the parent node of a tuple is a child node of another tuple, the parent node is an intermediate node in the multi-branch tree; if the parent node is not a child node of any tuple, the parent node is the root node of the multi-branch tree. If a child node in a tuple is a component, the child node is a leaf node in the multi-branch tree; if a child node in a tuple is a subtree, the child node is an intermediate node in the multi-branch tree.
[0067] In one implementation, the computer device determines each parent node that has child nodes, and forms a tuple from the parent node and the at least two child nodes belonging to it.
[0068] In another implementation, the computer device determines the nodes corresponding to shape structures in the multi-branch tree, takes each such node as a parent node, determines the at least two child nodes belonging to the parent node, and forms a tuple from the parent node and those child nodes.
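The tuple extraction can be sketched as a post-order walk that emits one tuple per shape-structure node. This is a minimal sketch under assumptions: the `Node` class, the function `extract_tuples`, and the placeholder string `<subtree>` (standing in for the preset symbol node that represents a child subtree) are hypothetical names, and emitting child tuples before their parent is one convenient ordering for a later leaf-to-root traversal.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

SUBTREE = "<subtree>"  # hypothetical preset symbol for a child that is itself a subtree

def extract_tuples(root: Node) -> List[Tuple[str, ...]]:
    """Return one tuple per non-leaf node, with child tuples listed before their parent."""
    tuples: List[Tuple[str, ...]] = []

    def visit(node: Node) -> None:
        if not node.children:
            return  # leaves never act as a tuple's parent
        for child in node.children:
            visit(child)  # post-order: finish the subtrees first
        slots = [SUBTREE if child.children else child.label for child in node.children]
        tuples.append((node.label, *slots))

    visit(root)
    return tuples

man = Node("a", [Node("氵"), Node("d", [Node("艹"), Node("两")])])
print(extract_tuples(man))
```

For the binary tree of "满" this gives two triples; a parent with three children would give a quadruple from the same code.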
[0069] Step S208: Following the direction from the leaf node to the root node of the multi-branch tree, traverse the tuples sequentially and encode them based on the traversed tuples until the root node is reached, thus obtaining the Chinese character representation.
[0070] The Chinese character representation refers to the features used to represent a Chinese character. It can include multiple dimensions, such as the number of tuples, the number of strokes, the content of the components, and the shape structure.
[0071] Specifically, the computer device can traverse the multi-branch tree in the direction from the leaf nodes to the root node, that is, in depth-first order. Further, the traversal from the leaf nodes to the root node can give priority to the left subtree. In another embodiment, the traversal from the leaf nodes to the root node can instead give priority to the right subtree.
[0072] Specifically, the computer device can traverse the tuples one by one in the direction from the leaf nodes to the root node of the multi-branch tree, that is, in the depth-first direction of the multi-branch tree. Each traversed tuple is input into the encoder for encoding processing, gradually restoring the structural information of the Chinese character to be generated, until the root node is reached, that is, until the currently traversed tuple is the last of the tuples of the Chinese character to be generated; the Chinese character representation is thus obtained. The encoder can include one or more encoder units. If the encoder includes one encoder unit, this unit is used cyclically during the encoding process; if the encoder includes multiple encoder units, each encoder unit processes one Chinese character to be generated independently during the encoding process. That is, multiple encoder units can process multiple Chinese characters to be generated in parallel, improving the efficiency of Chinese character generation.
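The leaf-to-root encoding, together with the last-in-first-out stack mentioned in an earlier embodiment, can be sketched as follows. Everything numeric here is a stand-in: `embed` derives a toy vector from a hash instead of using a learned embedding, `encode_tuple` is an arbitrary position-weighted combiner instead of a trained encoder unit, and the names `DIM`, `SUBTREE`, and `encode_character` are hypothetical. Only the control flow is meant to mirror the text: encode each tuple, push its result, and let a later tuple pop that result as the content of its subtree child.

```python
import hashlib
import math
from typing import List, Sequence, Tuple

DIM = 4                # illustrative vector size
SUBTREE = "<subtree>"  # hypothetical placeholder for a child that is itself a tuple

def embed(symbol: str) -> List[float]:
    """Deterministic toy embedding from a hash; stands in for a learned table."""
    digest = hashlib.sha256(symbol.encode("utf-8")).digest()
    return [digest[i] / 255.0 - 0.5 for i in range(DIM)]

def encode_tuple(parent: List[float], children: Sequence[List[float]]) -> List[float]:
    """Toy combiner: structure vector plus position-weighted child vectors, squashed."""
    return [
        math.tanh(parent[d] + sum((i + 1) * c[d] for i, c in enumerate(children)))
        for d in range(DIM)
    ]

def encode_character(tuples: Sequence[Tuple[str, ...]]) -> List[float]:
    """Encode tuples given in leaf-to-root order; the last tuple holds the root."""
    stack: List[List[float]] = []  # last-in-first-out store of subtree results
    for structure, *slots in tuples:
        child_vecs: List[List[float]] = []
        # Walk slots right to left so the newest result fills the rightmost placeholder.
        for slot in reversed(slots):
            child_vecs.append(stack.pop() if slot == SUBTREE else embed(slot))
        child_vecs.reverse()
        stack.append(encode_tuple(embed(structure), child_vecs))
    if len(stack) != 1:
        raise ValueError("tuples do not describe a single tree")
    return stack[0]

man_tuples = [("d", "艹", "两"), ("a", "氵", SUBTREE)]
representation = encode_character(man_tuples)
```

Popping in right-to-left slot order keeps the stack consistent when a parent has several subtree children, provided the tuples were listed with the left subtree first.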
[0073] Step S210: Obtain the target font representation corresponding to the target font style, and generate the target Chinese characters based on the Chinese character representation and the target font representation.
[0074] The target font style is the font style used to ultimately generate the target Chinese characters. The target font style can be one of the following: fashionable, cute, or artistic. The target font representation is the feature used to characterize the target font style. It should be noted that the font style of the Chinese characters can be the style of handwritten characters.
[0075] Specifically, the computer device acquires various font styles, determines the target font style from among them, and encodes the target font style to obtain a target font representation corresponding to the target font style. Based on the Chinese character representation and the target font representation, the computer device can integrate the target font style into the Chinese character to be generated, thus generating a target Chinese character with the target font style.
[0076] In another embodiment, the computer device obtains a number identifier representing the style of the target font and calls the target font representation corresponding to that number identifier. For example, the number identifier can be 0, 1, or 2, etc.
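The lookup by numeric identifier, followed by the combination step that an earlier embodiment describes (combine the two representations, pass them through a feedforward network, and reshape the result to a preset image size), might look like the sketch below. All sizes, the table `FONT_TABLE`, the matrix `WEIGHTS`, and the function `generate_image` are hypothetical, and the randomly initialised single linear layer merely stands in for a trained feedforward network, so the output is not a legible glyph.

```python
import random
from typing import Dict, List

CHAR_DIM, FONT_DIM, SIDE = 4, 3, 8  # illustrative sizes, not taken from the source

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical candidate font representations keyed by numeric style identifier.
FONT_TABLE: Dict[int, List[float]] = {}
for style_id in (0, 1, 2):
    FONT_TABLE[style_id] = [rng.uniform(-1, 1) for _ in range(FONT_DIM)]

# Random weights standing in for a trained feedforward layer.
WEIGHTS: List[List[float]] = []
for _ in range(SIDE * SIDE):
    WEIGHTS.append([rng.uniform(-1, 1) for _ in range(CHAR_DIM + FONT_DIM)])

def generate_image(char_repr: List[float], style_id: int) -> List[List[float]]:
    """Concatenate both representations, apply one linear layer, reshape to SIDE x SIDE."""
    if len(char_repr) != CHAR_DIM:
        raise ValueError("unexpected character representation size")
    combined = char_repr + FONT_TABLE[style_id]  # combined representation
    flat = [sum(w * x for w, x in zip(row, combined)) for row in WEIGHTS]
    return [flat[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]

image = generate_image([0.1, -0.2, 0.3, 0.4], style_id=1)
```

Keeping the font table separate from the character representation means the same encoded character can be rendered in another style by changing only `style_id`.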
[0077] The aforementioned Chinese character generation method acquires more than one component and at least one shape structure corresponding to the Chinese character to be generated, thereby generating a multi-branch tree corresponding to the Chinese character to be generated. The leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure. Based on the parent-child relationships between nodes in the multi-branch tree, at least one tuple in the multi-branch tree corresponding to the Chinese character to be generated can be determined. Following the direction from the leaf node to the root node of the multi-branch tree, the tuples are traversed sequentially, and encoding processing is performed on the traversed tuples until the root node is reached, thus obtaining the Chinese character representation. Then, the target font representation corresponding to the target font style is obtained, and the target Chinese character can be generated. It is understandable that components and shape structures are low-cost and easier-to-obtain data, while images and trajectory information are high-cost and difficult-to-obtain data. Based on components and shape structures, dependence on image and trajectory information can be avoided, thus enabling faster and lower-cost generation of Chinese characters, providing a large amount of training data for the Chinese character recognition model quickly and cost-effectively; it also reduces the time cost of collecting Chinese character data for the target font style.
[0078] Furthermore, acquiring the components and structural features of the Chinese character to be generated allows for more accurate and targeted generation of target characters containing those components, thereby increasing the number of characters with those components and making the distribution of characters with those components in the Chinese character database more balanced. Increasing the number of characters with those components also enhances the ability to extract detailed features from characters with those components. For example, target characters with rare components can be generated by strategically combining rare components with other components to create new characters. This results in a larger number of characters with those rare components in the Chinese character database, a more balanced distribution, and provides more characters with those rare components to the Chinese character recognition model, thus improving the accuracy of the model in recognizing those rare components.
[0079] In addition, after generating target Chinese characters with the target font style, it is also possible to obtain the annotations of the target Chinese characters, including components and shape structure, and use the annotated Chinese characters as data for other tasks to augment the data of other tasks.
[0080] In another embodiment, the computer device may acquire input text and annotate the input text with radicals; split the input text based on the radical annotations; if the input text is split into a single component, generate a Chinese character representation based on that component; acquire a target font representation corresponding to the target font style; and generate a target Chinese character based on the Chinese character representation and the target font representation.
[0081] If the number of components of the Chinese character to be generated is one, indicating that the input text is a single-component Chinese character, the computer device calls the embedding model to encode the component, that is, to encode the input text to obtain the Chinese character representation of the input text, and then obtains the target font representation corresponding to the target font style. Based on the Chinese character representation and the target font representation, the target Chinese character with the target font style corresponding to the single-component Chinese character can be accurately generated.
[0082] In one embodiment, obtaining more than one component and at least one shape structure corresponding to the Chinese character to be generated includes: obtaining the input text corresponding to the Chinese character to be generated, and performing radical annotation on the input text; splitting the input text based on the radical annotation to obtain more than one component; and determining at least one shape structure by which the components form the input text.
[0083] The computer device performs radical annotation on the input text, marking each radical of the input text, and then splits the input text according to the radical annotation to obtain more than one component. Following the splitting order of the input text from the outside to the inside, the computer device determines at least one shape structure by which the components form the input text.
[0084] It can be understood that when the computer device splits the input text, it first splits the input text at the outermost layer to obtain at least two components and determines the shape structure formed by those components. If any component can be split further, the computer device splits that component at the next inner layer to obtain at least two components and determines the shape structure formed by them. This continues until every component has been split to the preset minimum granularity, yielding more than one component and at least one shape structure.
[0085] The computer device performs radical annotation on the input text based on a preset splitting granularity, and then splits the input text according to the radical annotation to obtain more than one component. The splitting granularity is the minimum unit into which the input text is split; different splitting granularities lead to different ways of splitting the input text and therefore to different components.
[0086] As shown in Figure 5, the input text is "pin". Depending on the splitting granularity, it can be split in two ways, namely Way 1 (50) and Way 2 (504), and each way yields different components. In Way 1, splitting the Chinese character "pin" to be generated yields the components "fen" and "bei". In Way 2, splitting the Chinese character "pin" to be generated yields the components "ba", "dao", "tong" and "ren".
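The effect of the splitting granularity can be illustrated with a minimal sketch. It assumes a hypothetical lookup table `DECOMPOSITION` (not part of the original text) that maps a character to its shape structure and its direct parts, and it models the granularity as a maximum splitting depth `max_depth`, which is a simplifying assumption. The labels are the romanized names used in the example of "pin".

```python
# Hypothetical decomposition table: text -> (shape structure, direct parts).
# The entries follow the "pin" example; the table itself is illustrative only.
DECOMPOSITION = {
    "pin": ("upper-lower", ["fen", "bei"]),
    "fen": ("upper-lower", ["ba", "dao"]),
    "bei": ("upper-three-sided-enclosed", ["tong", "ren"]),
}


def split(text, max_depth):
    """Return the components of `text` after splitting at most `max_depth` layers.

    Assumption: the splitting granularity is modeled as a depth limit.
    """
    if max_depth == 0 or text not in DECOMPOSITION:
        return [text]  # cannot or must not be split further
    _, parts = DECOMPOSITION[text]
    components = []
    for part in parts:  # outside-to-inside, left to right
        components.extend(split(part, max_depth - 1))
    return components


way_1 = split("pin", max_depth=1)  # coarse granularity
way_2 = split("pin", max_depth=2)  # fine granularity
print(way_1)
print(way_2)
```

With the coarse setting the sketch stops at "fen" and "bei"; with the fine setting it continues down to "ba", "dao", "tong" and "ren", matching the two ways described for Figure 5.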
[0087] In this embodiment, the input text corresponding to the Chinese character to be generated is obtained and annotated with radicals; based on the radical annotation, the input text is split to obtain more than one component, and at least one shape structure by which the components form the input text is determined, so that the components and shape structures of the input text can be accurately obtained from its radicals.
[0088] In one embodiment, generating the multi-branch tree corresponding to the Chinese character to be generated based on the components and the shape structures includes: obtaining the hierarchical relationship represented by each shape structure in the Chinese character to be generated, and the components corresponding to each shape structure; and arranging the shape structures and their corresponding components in order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure, to generate the corresponding multi-branch tree.
[0089] In one embodiment, the Chinese character to be generated contains at least one shape structure, and each shape structure represents how components are combined at a given level of the character. For example, the outermost shape structure of the Chinese character "pin" to be generated is an upper-lower structure, whose components are "fen" and "bei"; the second-outermost shape structures are an upper-lower structure and an upper-three-sided enclosed structure. The components of that upper-lower structure are "ba" and "dao", split from "fen", and the components of the upper-three-sided enclosed structure are "tong" and "ren", split from "bei".
[0090] Further, the computer device arranges the shape structures and their corresponding components in order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure. The shape structures serve as the root node and intermediate nodes of the multi-branch tree to be generated, and the components corresponding to each shape structure serve as leaf nodes, thereby generating the multi-branch tree corresponding to the input text.
[0091] In the above embodiment, by obtaining the hierarchical relationship represented by each shape structure in the Chinese character to be generated and the components corresponding to each shape structure, and arranging the shape structures and their corresponding components in order from the outside to the inside according to that hierarchical relationship, the corresponding multi-branch tree can be accurately generated.
[0092] In one embodiment, the shapes and their corresponding components are arranged in order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure to generate a corresponding multi-branch tree. This includes: taking the outermost shape structure as the root node of the multi-branch tree to be generated; if there is a second outermost shape structure, then starting from the second outermost shape structure, taking the first shape structure corresponding to the current layer as the child node of the second shape structure corresponding to the outermost layer of the current layer, in order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure; if the shape structure corresponding to each layer has a corresponding component, then taking the corresponding component as the leaf node of the node where the shape structure of the corresponding layer is located; and generating a multi-branch tree based on the root node, leaf nodes, and intermediate nodes between the root node and leaf nodes.
[0093] Specifically, if only one shape structure exists in the Chinese character to be generated, that shape structure is the outermost shape structure and is used as the root node of the multi-branch tree to be generated. The components corresponding to the shape structure are used as leaf nodes to generate the multi-branch tree.
[0094] Here, of two adjacent layers, the first shape structure is the shape structure of the inner layer and the second shape structure is the shape structure of the outer layer. The first shape structure is used as a child node of the second shape structure, thereby establishing the parent-child relationship between the nodes corresponding to the shape structures.
[0095] If the Chinese character to be generated contains at least two shape structures, the outermost shape structure is taken as the root node of the multi-branch tree to be generated. Then, following the hierarchical relationship corresponding to each shape structure from the outside to the inside and starting from the second-outermost shape structure, the first shape structure corresponding to the current layer is taken as the child node of the second shape structure corresponding to the layer immediately outside the current layer. If the shape structure of a layer has corresponding components, those components are taken as leaf nodes of the node where that shape structure is located. Based on the root node, the leaf nodes, and the intermediate nodes between them, the multi-branch tree can be accurately generated.
[0096] In the above embodiments, the outermost shape structure is used as the root node of the multi-branch tree to be generated. If there is a second outermost shape structure, then according to the hierarchical relationship corresponding to each shape structure from the outside to the inside, starting from the second outermost shape structure, the first shape structure corresponding to the current layer is used as the child node of the second shape structure corresponding to the outermost layer of the current layer. If the shape structure corresponding to each layer has a corresponding component, then the corresponding component is used as the leaf node of the node where the shape structure of the corresponding layer is located. Based on the root node, leaf nodes, and intermediate nodes between the root node and leaf nodes, the multi-branch tree can be accurately generated.
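The outside-to-inside construction described above can be sketched as a short recursion. The `Node` class, the `build_tree` and `leaves` helpers, and the `DECOMPOSITION` table are hypothetical names introduced only for illustration; the entries follow the "pin" example, and this is one possible reading of the procedure rather than the claimed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                     # shape structure or component
    children: list = field(default_factory=list)   # empty for leaf nodes

    @property
    def is_leaf(self):
        return not self.children


# Hypothetical table: text -> (shape structure, direct parts).
DECOMPOSITION = {
    "pin": ("upper-lower", ["fen", "bei"]),
    "fen": ("upper-lower", ["ba", "dao"]),
    "bei": ("upper-three-sided-enclosed", ["tong", "ren"]),
}


def build_tree(text):
    """Shape structures become root/intermediate nodes; components become leaves."""
    if text not in DECOMPOSITION:
        return Node(text)                          # an unsplittable component
    structure, parts = DECOMPOSITION[text]
    # The outer structure is the parent; inner structures or components are children.
    return Node(structure, [build_tree(part) for part in parts])


def leaves(node):
    """Collect leaf labels from left to right."""
    if node.is_leaf:
        return [node.label]
    return [leaf for child in node.children for leaf in leaves(child)]


tree = build_tree("pin")
print(tree.label, [child.label for child in tree.children], leaves(tree))
```

A single-component input such as "ba" yields a tree consisting of one leaf, which corresponds to the single-component case handled separately in the text.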
[0097] In one embodiment, the method further includes a step of recording components and shapes, which specifically includes: traversing each node in the multi-branch tree according to a depth-first and left-subtree-first traversal order, and recording the components or shapes corresponding to the corresponding nodes according to the traversal order to obtain a sequence structure; wherein, if the multi-branch tree has only one subtree structure, the sequence structure is represented by a subset; if the multi-branch tree includes multiple subtree structures, the sequence structure is represented by a set composed of multiple nested subsets, each subset representing a subtree structure.
[0098] Depth-first traversal refers to following one branch of the multi-branch tree down to its leaf nodes before backtracking to traverse the remaining branches. Left-subtree-first traversal refers to traversing the multi-branch tree in the order of left subtree to right subtree.
[0099] The sequence structure refers to the structure in which the components and shape structures of the Chinese character to be generated are recorded in sequence. A subset represents a subtree structure in the multi-branch tree and corresponds to a tuple.
[0100] Furthermore, the computer device can also store the sequence structure of the Chinese character to be generated; when it is necessary to generate the Chinese character, the stored sequence structure of the Chinese character to be generated is obtained, the sequence structure is converted into a multi-branch tree of the Chinese character to be generated, and the step of determining at least one tuple in the multi-branch tree based on the parent-child relationship between each node in the multi-branch tree is continued.
[0101] It is understandable that sequence structures have smaller data volumes and are easier to store than multi-branch tree structures. Furthermore, sequence structures also record the structural relationships between the components of the Chinese character to be generated. Therefore, storing the Chinese character to be generated in a sequence structure can store the structural relationships between the components, allowing for faster generation of the Chinese character later, while also saving data storage space.
[0102] In the above embodiments, each node in the multi-branch tree is traversed in depth-first, left-subtree-first order, and the component or shape structure corresponding to each node is recorded in the traversal order, so that the sequence structure of the Chinese character to be generated can be accurately recorded.
[0103] In one embodiment, the sequence annotations of "烊", "抳", "揵", and a further Chinese character are as shown in Figure 6.
[0104] In another embodiment, the computer device traverses each node in the multi-branch tree in depth-first, right-subtree-first order, and records the component or shape structure corresponding to each node in the traversal order to obtain a sequence structure.
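As an illustration of the sequence structure, the sketch below records a tree in depth-first, left-subtree-first order and converts the stored sequence back into a tree. The representation is an assumption made for the example: a tree is either a component string or a pair `(shape_structure, [subtrees])`, and each nested Python list plays the role of one subset. The function names are hypothetical.

```python
def to_sequence(tree):
    """Record a tree as nested lists; each list is one subset (one subtree)."""
    if isinstance(tree, str):
        return tree                                # a component (leaf node)
    structure, children = tree
    # Parent first, then children from left to right (depth-first).
    return [structure] + [to_sequence(child) for child in children]


def visiting_order(tree):
    """Labels in the order a depth-first, left-subtree-first traversal visits them."""
    if isinstance(tree, str):
        return [tree]
    structure, children = tree
    order = [structure]
    for child in children:
        order.extend(visiting_order(child))
    return order


def from_sequence(sequence):
    """Rebuild the tree from a stored sequence structure."""
    if isinstance(sequence, str):
        return sequence
    return (sequence[0], [from_sequence(item) for item in sequence[1:]])


tree = ("upper-lower", [
    ("upper-lower", ["ba", "dao"]),
    ("upper-three-sided-enclosed", ["tong", "ren"]),
])
sequence = to_sequence(tree)
print(sequence)
print(visiting_order(tree))
```

A tree with a single subtree structure produces one flat list, while a tree with several subtree structures produces nested lists, which mirrors the distinction between one subset and a set of nested subsets.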
[0105] In one embodiment, based on the parent-child relationship between the nodes in the multi-way tree, at least one multi-tuple in the multi-way tree is determined, including: each node corresponding to a physical structure in the multi-way tree is respectively used as a target parent node; for each target parent node, if the child node of the target parent node is a leaf node, the leaf node subordinate to the target parent node is directly used as the target child node of the corresponding target parent node; if the child node of the target parent node is another target parent node, a preset symbol node is used as the target child node of the corresponding target parent node; the preset symbol node represents the subtree structure corresponding to another target parent node; each target parent node and the target child node subordinate to the target parent node form a multi-tuple.
[0106] Here, the target node, that is, the target parent node, is a node where a shape structure is located. The content of the preset symbol node is a preset symbol, which can be set as needed. For example, the preset symbol may be #, ¥, &, *, or !, without being limited to these.
[0107] If the child node of the target parent node is a leaf node, it means that the physical structure corresponding to the target parent node is in the innermost layer, and the leaf node subordinate to the target parent node is directly used as the target child node of the corresponding target parent node.
[0108] If the child node of the target parent node is another target parent node, it means that the physical structure corresponding to the target parent node is in the middle layer. The content of the child node of the target parent node is set as the preset symbol to obtain a preset symbol node, that is, the preset symbol node is used as the target child node of the corresponding target parent node.
[0109] In one embodiment, as shown in Figure 7, the computer device obtains the target nodes corresponding to the shape structures in the multi-branch tree, namely 702 and 704, and takes each target node as a target parent node. For the target parent node 702, the child nodes are "氵" and 704. Since "氵" is a leaf node, "氵" is directly taken as a target child node of the target parent node 702. Node 704 is another target parent node, so a preset symbol node is taken as the target child node; that is, node 704 is replaced with the preset symbol "#", which serves as the target child node, forming tuple 708. For the target parent node 704, the child nodes are "艹" and "两". Since both are leaf nodes, "艹" and "两" are directly taken as the target child nodes of the target parent node 704, forming tuple 706.
[0110] In the above embodiment, each node corresponding to the physical structure in the multi-way tree is taken as a target parent node; for each target parent node, if the child node of the target parent node is a leaf node, directly take the leaf node subordinate to the target parent node as the target child node of the corresponding target parent node; if the child node of the target parent node is another target parent node, take the preset symbol node as the target child node of the corresponding target parent node; the preset symbol node represents the subtree structure corresponding to another target parent node; each target parent node and the target child nodes subordinate to the target parent node can accurately form a multi-tuple.
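The tuple construction can be sketched as follows. The tree representation (a component string, or a pair of shape structure and child list) and the function name `extract_tuples` are assumptions for illustration. The text does not name the shape structures of nodes 702 and 704 in Figure 7, so the labels "left-right" and "upper-lower" below are placeholders.

```python
PRESET_SYMBOL = "#"  # stands in for a child that is itself a subtree


def extract_tuples(tree):
    """Return one tuple per shape-structure node, ordered from leaves to root."""
    tuples = []

    def visit(node):
        structure, children = node
        slots = []
        for child in children:
            if isinstance(child, str):
                slots.append(child)           # leaf node: use the component directly
            else:
                visit(child)                  # inner subtree gets its own tuple first
                slots.append(PRESET_SYMBOL)   # and is replaced by the preset symbol
        tuples.append((structure, *slots))

    visit(tree)
    return tuples


# Node 702 has children "氵" and node 704; node 704 has children "艹" and "两".
# The structure labels are hypothetical placeholders.
tree_702 = ("left-right", ["氵", ("upper-lower", ["艹", "两"])])
tuples = extract_tuples(tree_702)
print(tuples)
```

The first tuple in the result plays the role of tuple 706 and the second the role of tuple 708. Emitting inner tuples before their parents is a design choice of this sketch that matches the later leaf-to-root traversal.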
[0111] In another embodiment, the computer device obtains the sequence annotation of the Chinese character to be generated stored, determines the target nodes corresponding to the physical structure, and takes each target node as a target parent node; for each target parent node, if the child node corresponding to the physical structure of the target parent node is a component, directly take the component as the target child node of the corresponding target parent node; if the child node of the target parent node is another target parent node, take the preset symbol node as the target child node of the corresponding target parent node; the preset symbol node represents the multi-tuple corresponding to another target parent node; each target parent node and the target child nodes subordinate to the target parent node form a multi-tuple.
[0112] As shown in Figure 8, the computer device generates one triple based on the sequence annotation of the Chinese character "岷" to be generated, two triples based on the sequence annotation of the Chinese character "同" to be generated, three triples based on the sequence annotation of the Chinese character "蹱" to be generated, and four triples based on the sequence annotation of the Chinese character "遾" to be generated.
[0113] In one embodiment, the Chinese character representation is obtained by sequentially traversing tuples from the leaf nodes to the root node of the multi-way tree and encoding them accordingly until the root node is reached. This includes: sequentially traversing tuples in the multi-way tree from the leaf nodes to the root node; encoding the content corresponding to each node in the currently traversed tuple to obtain the subtree encoding result corresponding to the currently traversed tuple; if the currently traversed tuple corresponds to the target child node of another tuple, then the subtree encoding result corresponding to the currently traversed tuple is used as the content of the target child node of the other tuple; obtaining the next tuple as the current tuple and continuing the traversal, returning to the step of encoding the content corresponding to each node in the currently traversed tuple and continuing until the root node is reached; and outputting the subtree encoding result corresponding to the tuple where the root node is located as the Chinese character representation.
[0114] The subtree encoding result is obtained by encoding the content corresponding to each node in the tuple.
[0115] In one embodiment, if the currently traversed tuple corresponds to the target child node of another tuple, that is, the target child node of the other tuple is a preset symbol node, and the preset symbol node represents the subtree structure of the currently traversed tuple, then the subtree encoding result corresponding to the currently traversed tuple is used as the content of the target child node of the other tuple, that is, the subtree encoding result corresponding to the currently traversed tuple is used as the content of the preset symbol node corresponding to the other tuple.
[0116] In the above embodiment, the tuples in the multi-way tree are traversed sequentially from the leaf node to the root node. The content corresponding to each node in the currently traversed tuple is encoded to obtain the subtree encoding result corresponding to the currently traversed tuple. If the currently traversed tuple corresponds to the target child node of another tuple, the subtree encoding result corresponding to the currently traversed tuple is used as the content of the target child node of the other tuple. The next tuple is obtained as the current tuple and traversal continues. The step of encoding the content corresponding to each node in the currently traversed tuple is returned and executed until the root node is reached. The subtree encoding result corresponding to the tuple can be accurately obtained, thereby outputting the Chinese character representation of the Chinese character to be generated.
[0117] In one embodiment, as shown in Figure 9, the computer device obtains a multi-branch tree corresponding to the Chinese character to be generated. The tree includes five components, namely "疋", "亠", "厶", "儿", and "丨", and four shape structures, namely one left-right structure, two upper-lower structures, and one full-enclosure structure. Based on the parent-child relationships between the nodes in the tree, four triples are determined, namely 902, 904, 906, and 908; each triple includes a parent node and two child nodes subordinate to that parent node. Following the direction from the leaf nodes to the root node, with the left subtree prioritized, triple 902 is input into the tree encoder to obtain its subtree encoding result, which is used as the content of the corresponding preset symbol node in triple 906. Triple 904 is then input into the tree encoder to obtain its subtree encoding result, which is used as the content of the corresponding preset symbol node in triple 906, yielding triple 910. Triple 910 is input into the tree encoder to obtain its subtree encoding result, which is used as the content of the corresponding preset symbol node in triple 908, yielding triple 912. Finally, triple 912 is input into the tree encoder to obtain its subtree encoding result, which is the Chinese character representation of the Chinese character to be generated.
[0118] In one embodiment, encoding the content corresponding to each node in the current traversed multi-tuple to obtain a subtree encoding result corresponding to the current traversed multi-tuple includes: if the target child node in the current traversed multi-tuple corresponds to the target parent node of another multi-tuple, then using the subtree encoding result corresponding to the other multi-tuple as the first vector; if the content of the target child node in the current traversed multi-tuple is a component, determining that the data type of the content of the target child node is a character, and encoding the content of the target child node as the second vector; encoding the morphological structure corresponding to the target parent node in the current traversed multi-tuple into the third vector; determining the subtree encoding result corresponding to the current traversed multi-tuple based on at least one of the first vector and the second vector, and the third vector.
[0119] Among them, the first vector refers to the subtree encoding result of another multi-tuple corresponding to the target child node in the currently traversed multi-tuple, and the data type of this subtree encoding result is a vector. The second vector refers to the vector obtained by encoding the component in the currently traversed multi-tuple. The third vector refers to the vector obtained by encoding the morphological structure in the currently traversed multi-tuple.
[0120] Understandably, each tuple includes a parent node and at least two child nodes belonging to the parent node. The parent node is a shape structure, and the child nodes can all be components, all be preset symbols, or be a mix of components and preset symbols. Therefore, the tuple currently being traversed can include at least one of the first and second vectors, as well as a third vector.
[0121] In one embodiment, the computer device determines the subtree encoding result corresponding to the currently traversed tuple based on at least one of the first and second vectors, and the third vector. Taking a triple as an example, the computer device obtains the currently traversed triple. If a target child node in the triple corresponds to the target parent node of another triple, the subtree encoding result corresponding to the other triple is used as the first vector. If the content of a target child node in the triple is a component, the data type of that content is determined to be character, and an embedding model is invoked to encode the content into a second vector. The embedding model is then invoked to encode the shape structure corresponding to the target parent node in the triple into a third vector. This yields three vectors $e_0$, $e_1$, and $e_2$, where $e_i \in \mathbb{R}^{H}$ for $i = 0, 1, 2$; here $\mathbb{R}$ denotes the set of real numbers and $H$ is the dimension of each vector. The computer device then obtains the triple $e = \{e_0, e_1, e_2\}$, $e \in \mathbb{R}^{3H}$, where $3H$ is the dimension of the combined vector. The second and third vectors are updated along with the embedding model.
[0122] The computer device inputs the triple e into the recursive tree encoder, and the recursive tree encoder encodes the triple using the following formula (1) to obtain the subtree encoding result.
[0123] $o = \mathrm{BN}(e \cdot W + b), \quad o \in \mathbb{R}^{H}, \quad W \in \mathbb{R}^{3H \times H}$ (1)
[0124] Here, W is a preset matrix, b is a constant, and BN denotes the BatchNorm algorithm, which is used in deep networks to speed up training and convergence and to improve training stability.
[0125] The BN algorithm uses the following formula (2) for calculation:
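[0126] $\mu = \frac{1}{m}\sum_{i=1}^{m} x_i, \quad \sigma^{2} = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu)^{2}, \quad \hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^{2} + \epsilon}}, \quad y_i = \gamma \hat{x}_i + \beta$ (2)
[0127] Here, $\gamma$ and $\beta$ are learnable parameters, $m$ is the number of values $x_i$, $\mu$ is the mean of the $x_i$, $\sigma^{2}$ is the variance of the $x_i$, and $\epsilon$ is a small constant that prevents division by zero.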
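A minimal numeric sketch of formula (1) is given below, using only the Python standard library. It assumes that the triple is formed by concatenating three vectors of dimension H, that b is a scalar, and that BN is the standard BatchNorm computation applied per output feature over a mini-batch. The function names, the example values of W and b, and the two example triples are hypothetical.

```python
import math


def linear(e, W, b):
    """Compute e·W + b for a vector e of length 3H and a 3H x H matrix W."""
    return [sum(e[k] * W[k][j] for k in range(len(e))) + b
            for j in range(len(W[0]))]


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Standard BatchNorm: normalize each feature over the batch, then scale and shift."""
    m, width = len(batch), len(batch[0])
    out = [[0.0] * width for _ in range(m)]
    for j in range(width):
        column = [row[j] for row in batch]
        mu = sum(column) / m
        var = sum((x - mu) ** 2 for x in column) / m
        for i in range(m):
            out[i][j] = gamma[j] * (column[i] - mu) / math.sqrt(var + eps) + beta[j]
    return out


H = 2
# Two example triples, each the concatenation e0 + e1 + e2 (length 3H).
e_a = [1.0, 0.0] + [0.0, 0.0] + [1.0, 0.0]
e_b = [0.0, 1.0] + [0.0, 0.0] + [0.0, 1.0]
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
b = 0.5

pre_activation = [linear(e, W, b) for e in (e_a, e_b)]
o = batch_norm(pre_activation, gamma=[1.0] * H, beta=[0.0] * H)
print(pre_activation)
print(o)
```

Each row of `o` is a subtree encoding result of dimension H. In training, `gamma`, `beta`, and the entries of `W` would be learned, whereas the sketch fixes them by hand.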
[0128] In the above embodiments, if the target child node in the currently traversed tuple corresponds to the target parent node of another tuple, the subtree encoding result corresponding to the other tuple is used as the first vector; if the content of the target child node in the currently traversed tuple is a component, the data type of the content of the target child node is determined to be character, and the content of the target child node is encoded as the second vector; the shape structure corresponding to the target parent node in the currently traversed tuple is encoded as the third vector; encoding components and shape structures as vectors allows components and shape structures to transition from low-information character types to high-information vector types, which can enhance the expressive power of data, thereby more accurately determining the subtree encoding result corresponding to the currently traversed tuple based on at least one of the first and second vectors, as well as the third vector.
[0129] Figure 10 is a schematic diagram illustrating the encoding of tuples by a recursive tree encoder in one embodiment. The recursive tree encoder 1002 includes a fully connected layer and a linear unit, and may also include an embedding model. The computer device acquires each node in the currently traversed tuple. If a target child node in the tuple corresponds to the target parent node of another tuple, the content of that target child node is the subtree encoding result of the other tuple, which is used as the first vector. Its data type is vector, and it is input directly to the fully connected layer. If the content of a target child node in the tuple is a component, whose data type is character, the component is input to the embedding model and encoded as the second vector, which is then input to the fully connected layer. The shape structure corresponding to the target parent node in the tuple also has the data type character; it is encoded as the third vector, which is then input to the fully connected layer.
[0130] By combining the fully connected layer and the linear unit, the subtree encoding result corresponding to the tuple can be output and then stored in a last-in-first-out stack. The linear unit is processed using the BatchNorm (BN) algorithm.
[0131] In one embodiment, if the currently traversed tuple corresponds to the target child node of another tuple, then the subtree encoding result corresponding to the currently traversed tuple is used as the content of the target child node of the other tuple, including: storing the subtree encoding result corresponding to the currently traversed tuple into a Last-In-First-Out (LIFO) stack; if the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, then the subtree encoding result is obtained from the LIFO stack and used as the content of the target child node of the next tuple.
[0132] A stack is a data structure in which data items are arranged in order. Last-In-First-Out (LIFO) stack means that the data stored in the stack last is retrieved first. For example, if data A1 is stored in the LIFO stack first, then data A2, and then data A3, when retrieving data from the LIFO stack, data A3 is retrieved first, then data A2, and finally data A1.
[0133] The computer device stores the subtree encoding results corresponding to the currently traversed tuple into a Last-In-First-Out (LIFO) stack. If the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, it means that the target child node of the next tuple is a preset symbol node. The preset symbol node represents the subtree structure corresponding to the target parent node in the currently traversed tuple. Then, the subtree encoding result is obtained from the LIFO stack and used as the content of the target child node of the next tuple.
[0134] If the target child node of the next tuple to be traversed is a component, then the component is directly obtained as the content of the target child node.
[0135] In the above embodiments, the subtree encoding result corresponding to the currently traversed tuple is stored in a Last-In-First-Out (LIFO) stack. If the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, the subtree encoding result is obtained from the LIFO stack and used as the content of the target child node of the next tuple. This allows for a more accurate acquisition of the content of the target child node of the tuple, thereby generating Chinese characters more accurately.
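The interplay between the LIFO stack and the preset symbol node can be illustrated with a small sketch. It assumes that tuples arrive in leaf-to-root, left-subtree-first order, that each tuple is a hypothetical `(structure, children)` pair, and that `"#"` marks a preset symbol node. `toy_encode` is only a stand-in for the tree encoder: it builds a readable string rather than a vector. Handling several placeholders in one tuple is an extrapolation that the text does not spell out.

```python
PLACEHOLDER = "#"  # hypothetical marker for a preset symbol node


def toy_encode(structure, contents):
    """Stand-in for the tree encoder: returns a readable string, not a vector."""
    return f"{structure}({', '.join(contents)})"


def encode_bottom_up(tuples):
    """Encode tuples given in leaf-to-root order, resolving placeholders via a stack."""
    stack = []  # LIFO stack of subtree encoding results
    for structure, children in tuples:
        resolved = list(children)
        # Pop one stored result per placeholder. Walking the children from
        # right to left restores the original order, because the right-most
        # subtree was encoded (and pushed) last.
        for i in reversed(range(len(children))):
            if children[i] == PLACEHOLDER:
                resolved[i] = stack.pop()
        stack.append(toy_encode(structure, resolved))
    return stack.pop()  # result of the tuple that contains the root node


# Two tuples: the inner structure S2 is encoded first, then the root S1.
result = encode_bottom_up([("S2", ["b", "c"]), ("S1", ["a", "#"])])
print(result)  # S1(a, S2(b, c))
```

If a child is a plain component, it is used directly as the node content, as paragraph [0134] describes; only placeholders trigger a pop.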
[0136] In one embodiment, obtaining a target font representation corresponding to a target font style includes: selecting a target font representation corresponding to the target font style from at least one candidate font representation; wherein, the step of generating candidate font representations includes: obtaining handwritten Chinese character images of different font styles; encoding each handwritten Chinese character image to obtain candidate font representations corresponding to each font style.
[0137] Here, a candidate font representation is one of the font representations from which the target font representation is selected.
[0138] Specifically, computer equipment acquires images of handwritten Chinese characters in different font styles, and uses an embedding model to encode each handwritten Chinese character image separately, obtaining candidate font representations corresponding to each font style. The data type of the candidate font representations is vector, and each candidate font representation includes at least one dimension.
[0139] In one implementation, the computer device randomly selects a target font representation corresponding to the target font style from at least one candidate font representation. In another implementation, the computer device selects, from at least one candidate font representation, the target font representation corresponding to the target font style chosen by the user. In other implementations, the computer device may also use other methods to select a target font representation corresponding to the target font style from at least one candidate font representation, which is not limited here.
[0140] In the above embodiments, handwritten Chinese character images with different font styles are obtained; each handwritten Chinese character image is encoded to obtain candidate font representations corresponding to each font style; and then the target font representation corresponding to the target font style is quickly selected from at least one candidate font representation.
[0141] In one embodiment, as shown in Figure 11, a computer device acquires images of handwritten Chinese characters with four different font styles, and numbers the fonts as 0, 1, 2 and 3. The font images numbered 0, 1, 2 and 3 are input into a font style embedding model for font style encoding, which yields the font representation corresponding to the font style of each Chinese character image.
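A minimal sketch of the selection step follows, assuming the candidate font representations have already been produced and are stored by font number. The vectors are made-up placeholder values, and `select_font_representation` is a hypothetical helper that covers both the random and the user-selected variants described above.

```python
import random

# Hypothetical candidate font representations keyed by font number (0..3),
# standing in for the output of the font style embedding model.
candidate_fonts = {
    0: [0.1, 0.9],
    1: [0.4, 0.2],
    2: [0.7, 0.5],
    3: [0.3, 0.8],
}


def select_font_representation(candidates, font_id=None, rng=random):
    """Return the representation for font_id, or a random candidate if None."""
    if font_id is None:
        font_id = rng.choice(sorted(candidates))  # random selection
    return candidates[font_id]                    # user-selected or random font


target_font = select_font_representation(candidate_fonts, font_id=2)
print(target_font)  # [0.7, 0.5]
```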
[0142] In one embodiment, generating a target Chinese character based on a Chinese character representation and a target font representation includes: combining the Chinese character representation and the target font representation to obtain a combined representation; processing the combined representation through a feedforward network to obtain an image representation vector; adjusting the dimension of the image representation vector to a preset dimension and adjusting the size of the image representation vector to a preset size; and generating an image-type target Chinese character based on the adjusted image representation vector.
[0143] Here, the combined representation is the feature obtained by combining the Chinese character representation and the target font representation. The image representation vector is a vector used to represent the Chinese character image with the target font style. For example, the image representation vector has shape H*1*1, where H is the dimension of the image representation vector and 1*1 is its length and width.
[0144] A feedforward network, also known as a feedforward neural network, is a unidirectional, multi-layer artificial neural network. Each layer contains several neurons; each neuron receives input from the previous layer and outputs to the next layer, from the input layer through to the output layer. Neurons within the same layer are not interconnected, and information is transmitted between layers in one direction only. There is no feedback anywhere in the network, so it can be represented by a directed acyclic graph.
[0145] A feedforward network can include fully connected layers and activation layers. The fully connected layers can be one or more. They are primarily used for classifying the fused word vectors. The activation layers process the data using activation functions. These activation functions can be softmax functions, which introduce non-linearity to transform the continuous real-valued inputs into outputs between 0 and 1, thus improving the expressive power of the feedforward network. In this implementation, the feedforward network includes two fully connected layers and one activation layer.
[0146] The combined representation is processed by a fully connected layer to obtain a fully connected vector; the fully connected vector is then processed by an activation function through an activation network layer to obtain an image representation vector.
[0147] Specifically, computer equipment performs deconvolution upsampling or pixel-by-pixel prediction on the image representation vector, adjusting the dimension and size of the image representation vector to a preset dimension and size. Based on the adjusted image representation vector, a target Chinese character of the image type can be generated. The target Chinese character of this image type has a preset size, and the image of the target Chinese character is an image of the preset dimensions. If the computer equipment performs deconvolution upsampling on the image representation vector, the generated image is smoother. If the computer equipment performs pixel-by-pixel prediction on the image representation vector, the generated image is clearer and sharper.
[0148] Both the preset dimension and the preset size can be set as needed. For example, the preset dimension can be 3, representing the RGB (Red, Green, Blue) channels; it can also be 1, representing grayscale values; or 4, representing the RGBW (Red, Green, Blue, White) channels. The preset size can be 32*32, meaning the image-type target Chinese character is adjusted to an image 32 pixels long and 32 pixels wide; it can also be 50*40, meaning an image 50 pixels long and 40 pixels wide.
[0149] Figure 12 is a schematic diagram of a target Chinese character with the target font style generated in one embodiment.
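To make the change of shape concrete, the following sketch applies a single transposed-convolution (deconvolution) step to an H*1*1 vector. It assumes stride 1, no bias, and hand-picked toy weights; the application does not specify the layer configuration, and a practical network would stack several such layers to reach a size such as 32*32. `upsample_1x1` is a hypothetical helper name.

```python
def upsample_1x1(vector, kernel):
    """Transposed convolution of an H x 1 x 1 input with an H x C x K x K kernel.

    With a 1 x 1 spatial input, the output is C x K x K and each value is
    out[ch][i][j] = sum over h of vector[h] * kernel[h][ch][i][j].
    """
    H = len(vector)
    C, K = len(kernel[0]), len(kernel[0][0])
    return [[[sum(vector[h] * kernel[h][ch][i][j] for h in range(H))
              for j in range(K)]
             for i in range(K)]
            for ch in range(C)]


# Toy sizes (assumptions): H = 2, C = 1 grayscale channel, 2 x 2 output image.
vector = [1.0, 2.0]
kernel = [
    [[[1.0, 0.0], [0.0, 1.0]]],  # weights for input dimension 0
    [[[0.0, 1.0], [1.0, 0.0]]],  # weights for input dimension 1
]
image = upsample_1x1(vector, kernel)
print(image)  # [[[1.0, 2.0], [2.0, 1.0]]]
```

Here the preset dimension corresponds to C (1 for grayscale) and the preset size to K*K.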
[0150] The computer device can call a Long Short-Term Memory (LSTM) network to predict pixels one by one, adjust the dimension of the image representation vector to a preset dimension, and adjust the size of the image representation vector to a preset size. A Long Short-Term Memory network is a type of recurrent neural network.
[0151] For example, a computer device acquires a target font representation f and a Chinese character representation c, combines the target font representation f and the Chinese character representation c to obtain a combined representation, inputs the combined representation into a feedforward network, and processes the combined representation through the feedforward network as shown in formula (3) to obtain an image representation vector I.
[0152] $I = \max(0, \{f, c\} \cdot W_0 + b_0)\, W_1 + b_1, \quad W_0 \in \mathbb{R}^{2H \times H}, \; W_1 \in \mathbb{R}^{H \times H}$ (3)
[0153] Where W0 and W1 are two predefined matrices, and b0 and b1 are two constants.
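Formula (3) can be traced by hand with a small pure-Python sketch. It reads {f, c} as the concatenation of f and c, which is consistent with W0 having 2H rows, and it treats b0 and b1 as scalar constants as stated above. The values of f, c, W0, W1, b0 and b1 are toy assumptions with H = 2, not trained parameters, and `matvec` and `feedforward` are hypothetical helper names.

```python
def matvec(x, W):
    """Row vector x (length n) times matrix W (n rows, m columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def feedforward(f, c, W0, b0, W1, b1):
    """I = max(0, {f, c} . W0 + b0) W1 + b1, with {f, c} read as concatenation."""
    combined = f + c                                           # length 2H
    hidden = [max(0.0, h + b0) for h in matvec(combined, W0)]  # length H
    return [o + b1 for o in matvec(hidden, W1)]                # length H


# Toy values with H = 2 (assumptions for illustration only).
f = [1.0, 2.0]
c = [3.0, -4.0]
W0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # 2H x H
W1 = [[1.0, 0.0], [0.0, 1.0]]                          # H x H
I = feedforward(f, c, W0, b0=0.0, W1=W1, b1=0.5)
print(I)  # [4.5, 0.5]
```

The second hidden value is negative before the max(0, ·) step (2 - 4 = -2), so it is clipped to 0 and only the constant b1 remains in the second output.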
[0154] In the above embodiments, the Chinese character representation and the target font representation are combined to obtain a combined representation. The combined representation is processed by a feedforward network to obtain an image representation vector. The dimension of the image representation vector is then adjusted to a preset dimension, and the size of the image representation vector is adjusted to a preset size, so that the target Chinese character of the image type can be accurately generated.
[0155] In one embodiment, the target font style is a target handwritten font style, and the method further includes: using the target Chinese characters as training data to train the Chinese character recognition model to obtain a trained Chinese character recognition model; the trained Chinese character recognition model is used to recognize handwritten Chinese characters in the image to be recognized.
[0156] The target handwritten font style is the handwritten font style used to ultimately generate the target Chinese characters. The Chinese character recognition model is the model used to recognize Chinese characters. Specifically, the Chinese character recognition model can be an OCR (optical character recognition) model, which can support text detection and recognition in various scenarios. For example, an OCR model can recognize general text, card text, invoice text, automotive-related text, industry documents, traditional Chinese characters and artistic fonts, vertical Chinese and English text, and multi-angle recognition, etc.
[0157] In one embodiment, the computer device uses the generated target Chinese characters as training data to train the Chinese character recognition model until the training cutoff condition is met, thus obtaining a trained Chinese character recognition model. The computer device then uses the trained Chinese character recognition model to recognize handwritten Chinese characters in the image to be recognized, which can more accurately identify the handwritten Chinese characters in the image.
[0158] In one embodiment, the method further includes a step of recognizing the image to be recognized based on the trained Chinese character recognition model. This step specifically includes: displaying an image recognition interface; uploading an image to be recognized containing handwritten Chinese characters through the image recognition interface; and performing image recognition processing on the image to be recognized using the trained Chinese character recognition model to obtain and output the Chinese character content included in the image to be recognized.
[0159] An image recognition interface is displayed on the screen of a computer device. When the user triggers the upload button, an image containing handwritten Chinese characters is uploaded through the image recognition interface. The computer device calls the trained Chinese character recognition model to process the image to be recognized, so that the Chinese character content contained in the image is obtained more accurately and output.
[0160] As shown in Figure 13, the computer device uploads an image containing handwritten Chinese characters through an image recognition interface. The trained Chinese character recognition model performs image recognition processing on the image to obtain and output the Chinese character content contained in the image.
[0161] In one embodiment, the image recognition interface of the computer device may also include various function buttons, such as buttons for general text recognition, card / certificate text recognition, invoice / document recognition, vehicle-related recognition, industry document recognition, and intelligent scanning. Each function button's sub-menu may further include sub-functions, such as general printed text, general printed text (high-precision version), general printed text (simplified version), general handwriting, English, fast text detection, and advertising text recognition.
[0162] Figure 14 is a schematic diagram of a Chinese character generation method in one embodiment. The method is applied to a computer device and includes the following steps. Obtain a sequence annotation 1402 of the Chinese character to be generated; the sequence annotation includes 3 components, namely "氵", "艹" and "两", and the 2 shape structures present in the Chinese character to be generated, namely the left-right structure and the up-down structure. Generate a multi-branch tree 1404 corresponding to the Chinese character to be generated based on the components and the shape structures; the leaf nodes of the multi-branch tree 1404 correspond to the components, and the non-leaf nodes correspond to the shape structures. Based on the parent-child relationships between the nodes in the multi-branch tree 1404, determine the 2 triples in the multi-branch tree 1404, namely 1406 and 1408; each triple includes a parent node and at least two child nodes belonging to the parent node. Traverse the triples in the direction from the leaf nodes to the root node of the multi-branch tree 1404, with the left subtree prioritized, and input them into the tree encoder in turn: triple 1406 is traversed first and input into the tree encoder to obtain its subtree encoding result, and then triple 1408 is input into the tree encoder. The preset symbol node "#" in triple 1408 represents the subtree structure of triple 1406, so the subtree encoding result of triple 1406 is used as the content of the preset symbol node "#" in triple 1408. Encoding then continues on triple 1408 to obtain its subtree encoding result, which is also the Chinese character representation 1410 of the Chinese character to be generated.
Input the font image of the target font style into the embedding model to obtain the target font representation 1412 corresponding to the target font style. Combine the Chinese character representation 1410 and the target font representation 1412 to obtain a combined representation 1414. Process the combined representation 1414 through a feedforward network to obtain an image representation vector 1416. Perform deconvolution upsampling or pixel-by-pixel prediction on the image representation vector 1416 to adjust its dimension to a preset dimension and its size to a preset size, obtaining an adjusted image representation vector 1418. Generate the image-type target Chinese character "满" with the target font style based on the adjusted image representation vector 1418.
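The tree-to-triples part of this walk-through can be sketched as follows. The nested representation, the helper name `collect_tuples`, and the assignment of the up-down structure to triple 1406 and the left-right structure to triple 1408 are a reading of the description of "满" (left part "氵", right part "艹" over "两"), not details stated for Figure 14 itself.

```python
# Assumed tree for "满": a left-right structure whose right part is an
# up-down structure. Non-leaf nodes are (structure, [children]); leaves are str.
tree = ("left-right", ["氵", ("up-down", ["艹", "两"])])


def collect_tuples(node, out):
    """Post-order walk: emit (structure, children) with "#" for sub-structures."""
    structure, children = node
    contents = []
    for child in children:
        if isinstance(child, tuple):    # child is itself a shape structure
            collect_tuples(child, out)  # its tuple is emitted first (leaf to root)
            contents.append("#")        # preset symbol node stands for the subtree
        else:                           # child is a component
            contents.append(child)
    out.append((structure, contents))
    return out


tuples = collect_tuples(tree, [])
for t in tuples:
    print(t)
# ('up-down', ['艹', '两'])
# ('left-right', ['氵', '#'])
```

The output order matches the traversal described above: the inner triple is produced before the triple that contains the root node.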
[0163] In one embodiment, another Chinese character generation method is provided. The method is described by taking its application to a computer device as an example, where the computer device can specifically be the terminal or the server in Figure 1. The Chinese character generation method includes the following steps:
[0164] Step 1: Obtain the input text corresponding to the Chinese character to be generated, and perform radical annotation on the input text; split the input text based on the radical annotation to obtain more than one component; determine at least one shape structure by which the components form the input text.
[0165] Step 2: Obtain the hierarchical relationship represented by each shape structure in the Chinese character to be generated, as well as the components corresponding to each shape structure; the leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure.
[0166] Step 3: Use the outermost shape structure as the root node of the multi-branch tree to be generated; if there is a second-outermost shape structure, then, following the hierarchical relationship of the shape structures from the outside to the inside and starting from the second-outermost shape structure, take the first shape structure corresponding to the current layer as a child node of the second shape structure corresponding to the layer immediately outside the current layer; if the shape structure at a given level has corresponding components, take those components as leaf nodes of the node where that shape structure is located; generate the multi-branch tree based on the root node, the leaf nodes, and the intermediate nodes between the root node and the leaf nodes.
[0167] Step 4: Traverse each node in the multi-branch tree according to the depth-first and left-subtree-first traversal order, and record the corresponding component or shape structure of the corresponding node according to the traversal order to obtain the sequence structure; if the multi-branch tree has only one subtree structure, the sequence structure is represented by a subset; if the multi-branch tree includes multiple subtree structures, the sequence structure is represented by a set composed of multiple nested subsets, with each subset representing a subtree structure; store the sequence structure of the Chinese characters to be generated.
[0168] Step 5: Take each node in the multi-branch tree corresponding to the shape structure as a target parent node; for each target parent node, if the child node of the target parent node is a leaf node, then directly take the leaf node belonging to the target parent node as the target child node of the corresponding target parent node; if the child node of the target parent node is another target parent node, then take the preset symbol node as the target child node of the corresponding target parent node; the preset symbol node represents the subtree structure corresponding to the other target parent node; form a tuple by taking each target parent node and the target child nodes belonging to the target parent node; each tuple includes a parent node and at least two child nodes belonging to the parent node.
[0169] Step 6: Traverse the tuples in the multi-branch tree sequentially from the leaf nodes to the root node. If the target child node in the currently traversed tuple corresponds to the target parent node of another tuple, use the subtree encoding result corresponding to the other tuple as the first vector. If the content of the target child node in the currently traversed tuple is a component, determine that the data type of the target child node's content is character, and encode the content of the target child node as the second vector. Encode the shape structure corresponding to the target parent node in the currently traversed tuple as the third vector. Determine the subtree encoding result corresponding to the currently traversed tuple based on the third vector and at least one of the first and second vectors.
[0170] Step 7: Store the subtree encoding result corresponding to the currently traversed tuple into the Last-In-First-Out (LIFO) stack; if the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, then retrieve the subtree encoding result from the LIFO stack and use it as the content of the target child node of the next tuple.
[0171] Step 8: Obtain the next tuple as the current tuple and continue traversing. Return to the step of encoding the content corresponding to each node in the currently traversed tuple and continue to execute until the root node is reached. Output the encoding result of the subtree corresponding to the tuple where the root node is located as the Chinese character representation.
[0172] Step 9: Select the target font representation corresponding to the target font style from at least one candidate font representation; wherein, the steps for generating candidate font representations include: acquiring handwritten Chinese character images of different font styles; encoding each handwritten Chinese character image to obtain candidate font representations corresponding to each font style.
[0173] Step 10: Combine the Chinese character representation and the target font representation to obtain a combined representation; process the combined representation through a feedforward network to obtain an image representation vector; adjust the dimension of the image representation vector to a preset dimension and adjust the size of the image representation vector to a preset size; generate the target Chinese character of the image type based on the adjusted image representation vector.
[0174] Step 11: Use the target Chinese character as training data to train the Chinese character recognition model to obtain the trained Chinese character recognition model; the trained Chinese character recognition model is used to recognize handwritten Chinese characters in the image to be recognized.
[0175] Step 12: Display the image recognition interface; upload an image containing handwritten Chinese characters to be recognized through the image recognition interface; perform image recognition processing on the image to be recognized using the trained Chinese character recognition model to obtain and output the Chinese character content included in the image to be recognized.
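The sequence structure of Step 4 can be illustrated with a short sketch. It assumes the same nested tree representation as a `(structure, [children])` pair and writes each subtree as one brace-delimited subset; the brace notation and the helper name `to_sequence` are hypothetical, since the steps above do not fix a concrete syntax.

```python
def to_sequence(node):
    """Depth-first, left-subtree-first serialization; one {...} subset per subtree."""
    if isinstance(node, str):   # leaf node: a component
        return node
    structure, children = node  # non-leaf node: a shape structure
    parts = [structure] + [to_sequence(child) for child in children]
    return "{" + ", ".join(parts) + "}"


# Assumed tree for "满" (left-right structure containing an up-down structure).
tree = ("left-right", ["氵", ("up-down", ["艹", "两"])])
sequence = to_sequence(tree)
print(sequence)  # {left-right, 氵, {up-down, 艹, 两}}
```

A tree with a single subtree structure yields one subset, while nested shape structures yield nested subsets, in line with Step 4.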
[0176] In the above embodiments, more than one component used to construct the Chinese character to be generated, and at least one shape structure existing in the Chinese character to be generated, are obtained to generate a multi-branch tree corresponding to the Chinese character to be generated; the leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure. Then, based on the parent-child relationship between the nodes in the multi-branch tree, at least one tuple in the multi-branch tree corresponding to the Chinese character to be generated can be determined. Following the direction from the leaf node to the root node of the multi-branch tree, the tuples are traversed sequentially and encoded based on the traversed tuples until the root node is reached, so as to obtain the Chinese character representation. Then, the target font representation corresponding to the target font style is obtained, and the target Chinese character can be generated. It can be understood that components and shape structures are low-cost and easier-to-obtain data, while images and trajectory information are high-cost and difficult-to-obtain data. Based on components and shape structures, the dependence on images and trajectory information can be avoided, so that Chinese characters can be generated faster and at a lower cost.
[0177] Furthermore, by acquiring the components and shape structure of the Chinese character to be generated, the target Chinese character with that component can be generated more accurately and in a more targeted manner, thereby increasing the number of Chinese characters with that component and making the distribution of Chinese characters with that component in the Chinese character database more balanced; at the same time, increasing the number of Chinese characters with that component can also enhance the ability to extract detailed features of Chinese characters with that component in the future.
[0178] This application also provides an application scenario in which the above-described Chinese character generation method is applied. Specifically, the application of the Chinese character generation method in this scenario is as follows:
[0179] The computer device acquires the input Chinese characters and the target handwriting style desired by the user. It then obtains the components and structural features of the input characters. Based on these components and structural features, a multi-branch tree can be constructed for the input Chinese characters. This multi-branch tree can then be used to encode the Chinese character representation and obtain the target font representation for the desired handwriting style. Finally, based on this character representation and the target font representation, handwritten Chinese characters with the target font style can be generated.
[0180] This application also provides another application scenario in which the above-described Chinese character generation method is applied. Specifically, the application of the Chinese character generation method in this scenario is as follows:
[0181] The computer device acquires the input Chinese characters and the target font style of the artistic font desired by the user. It then obtains the components and structural features of the input Chinese characters. Based on these components and structural features, a multi-branch tree can be constructed for the input Chinese characters. This multi-branch tree can then be used to encode the Chinese character representation and obtain the target font representation of the artistic font style. Finally, based on this Chinese character representation and the target font representation, an artistic font with the target font style can be generated.
[0182] It should be understood that, although the steps in the flowchart of Figure 2 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless otherwise specified herein, there is no strict order in which these steps are executed, and they can be performed in other orders. At least some of the steps in Figure 2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same time, but may be executed at different times; nor are they necessarily executed sequentially, but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
[0183] In one embodiment, as shown in Figure 15, a Chinese character generation device is provided. This device can be a software module, a hardware module, or a combination of both, integrated into a computer device. Specifically, the device includes: an acquisition module 1502, a structure processing module 1504, an encoding module 1506, and a Chinese character generation module 1508, wherein:
[0184] The acquisition module 1502 is used to acquire more than one component and at least one shape structure corresponding to the Chinese character to be generated.
[0185] The structure processing module 1504 is used to generate a multi-branch tree corresponding to the Chinese character to be generated based on the components and shape structure; the leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure.
[0186] The structure processing module 1504 is also used to determine at least one tuple in the multi-branch tree based on the parent-child relationship between nodes in the multi-branch tree; each tuple includes a parent node and at least two child nodes belonging to the parent node.
[0187] The encoding module 1506 is used to traverse the tuples sequentially from the leaf nodes to the root node of the multi-branch tree and perform encoding processing based on the traversed tuples until the root node is reached, so as to obtain the Chinese character representation.
[0188] The Chinese character generation module 1508 is used to obtain the target font representation corresponding to the target font style, and generate the target Chinese characters based on the Chinese character representation and the target font representation.
[0189] The aforementioned Chinese character generation device acquires more than one component and at least one shape structure corresponding to the Chinese character to be generated, thereby generating a multi-branch tree corresponding to the Chinese character to be generated. The leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure. Based on the parent-child relationships between nodes in the multi-branch tree, at least one tuple in the multi-branch tree corresponding to the Chinese character to be generated can be determined. Following the direction from the leaf node to the root node of the multi-branch tree, the tuples are traversed sequentially, and encoding processing is performed on the traversed tuples until the root node is reached, thus obtaining the Chinese character representation. Then, the target font representation corresponding to the target font style is acquired, and the target Chinese character can be generated. It is understandable that components and shape structures are low-cost and easier-to-obtain data, while images and trajectory information are high-cost and difficult-to-obtain data. Based on components and shape structures, dependence on image and trajectory information can be avoided, thus enabling faster and lower-cost generation of Chinese characters, providing a large amount of training data for the Chinese character recognition model quickly and cost-effectively; it can also reduce the time cost of collecting Chinese character data for the target font style.
[0190] Furthermore, acquiring the components and structural features of the Chinese character to be generated allows for more accurate and targeted generation of target characters containing those components, thereby increasing the number of characters with those components and making the distribution of characters with those components in the Chinese character database more balanced. Increasing the number of characters with those components also enhances the ability to extract detailed features from characters with those components. For example, target characters with rare components can be generated by strategically combining rare components with other components to create new characters. This results in a larger number of characters with those rare components in the Chinese character database, a more balanced distribution, and provides more characters with those rare components to the Chinese character recognition model, thus improving the accuracy of the model in recognizing those rare components.
[0191] In addition, after generating target Chinese characters with the target font style, it is also possible to obtain the annotations of the target Chinese characters, including components and shape structure, and use the annotated Chinese characters as data for other tasks to augment the data of other tasks.
[0192] In one embodiment, the acquisition module 1502 is further configured to acquire the input text corresponding to the Chinese character to be generated, and to annotate the input text with radicals; to split the input text based on the radical annotations to obtain more than one component; and to determine at least one shape structure for forming the input text by the components.
[0193] In one embodiment, the structure processing module 1504 is further configured to obtain the hierarchical relationship represented by each shape structure in the Chinese character to be generated, and the components corresponding to each shape structure; and to arrange the shape structure and its corresponding components in the order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure to generate a corresponding multi-branch tree.
[0194] In one embodiment, the structure processing module 1504 is further configured to use the outermost shape structure as the root node of the multi-branch tree to be generated; if there is a second-outermost shape structure, then, following the hierarchical relationship of the shape structures from the outside to the inside and starting from the second-outermost shape structure, the first shape structure corresponding to the current layer is used as a child node of the second shape structure corresponding to the layer immediately outside the current layer; if the shape structure at a given layer has corresponding components, those components are used as leaf nodes of the node where that shape structure is located; and a multi-branch tree is generated based on the root node, the leaf nodes, and the intermediate nodes between the root node and the leaf nodes.
[0195] In one embodiment, the structure processing module 1504 is further configured to treat each node in the multi-branch tree corresponding to the shape structure as a target parent node; for each target parent node, if the child node of the target parent node is a leaf node, then the leaf node belonging to the target parent node is directly used as the target child node of the corresponding target parent node; if the child node of the target parent node is another target parent node, then the preset symbol node is used as the target child node of the corresponding target parent node; the preset symbol node represents the subtree structure corresponding to the other target parent node; and each target parent node and the target child node belonging to the target parent node form a tuple.
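A minimal sketch of the tuple extraction just described is given below. The `Node` class, the `extract_tuples` function, and the placeholder string `"<SUB>"` standing in for the preset symbol node are hypothetical names chosen for illustration; the application does not specify them.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

PLACEHOLDER = "<SUB>"  # hypothetical stand-in for the preset symbol node


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def extract_tuples(root: Node) -> List[Tuple[str, ...]]:
    """Return one tuple per shape-structure node, root first.

    Each tuple is (parent structure, child 1, child 2, ...). A leaf child
    contributes its component; a non-leaf child is replaced by PLACEHOLDER,
    which represents the subtree rooted at that child.
    """
    tuples: List[Tuple[str, ...]] = []

    def visit(node: Node) -> None:
        if not node.children:
            return  # leaves are not target parent nodes
        members = [node.label]
        for child in node.children:
            members.append(PLACEHOLDER if child.children else child.label)
        tuples.append(tuple(members))
        for child in node.children:
            visit(child)

    visit(root)
    return tuples


# 湖 = ⿰(氵, ⿰(古, 月))
tree = Node("⿰", [Node("氵"), Node("⿰", [Node("古"), Node("月")])])
tuples = extract_tuples(tree)
# [("⿰", "氵", "<SUB>"), ("⿰", "古", "月")]
```

The placeholder keeps every tuple flat, so each one can be encoded on its own once the results for its sub-structures are available.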
[0196] In one embodiment, the encoding module 1506 is further configured to traverse the tuples in the multi-branch tree sequentially from the leaf nodes to the root node; encode the content corresponding to each node in the currently traversed tuple to obtain the subtree encoding result corresponding to the currently traversed tuple; if the currently traversed tuple corresponds to the target child node of another tuple, use the subtree encoding result corresponding to the currently traversed tuple as the content of the target child node of the other tuple; take the next tuple as the current tuple and return to the step of encoding the content corresponding to each node in the currently traversed tuple, continuing until the root node is reached; and output the subtree encoding result corresponding to the tuple where the root node is located as the Chinese character representation.
[0197] In one embodiment, the encoding module 1506 is further configured to: if the target child node in the currently traversed tuple corresponds to the target parent node of another tuple, then use the subtree encoding result corresponding to the other tuple as a first vector; if the content of the target child node in the currently traversed tuple is a component, then determine that the data type of the content of the target child node is a character, and encode the content of the target child node as a second vector; encode the shape structure corresponding to the target parent node in the currently traversed tuple as a third vector; and determine the subtree encoding result corresponding to the currently traversed tuple based on at least one of the first and second vectors, and the third vector.
[0198] In one embodiment, the encoding module 1506 is further configured to store the subtree encoding result corresponding to the currently traversed tuple into a last-in-first-out stack; if the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, the subtree encoding result is obtained from the last-in-first-out stack and used as the content of the target child node of the next tuple.
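The leaf-to-root encoding with a last-in-first-out stack can be sketched as follows. The embedding and combination functions here are toy placeholders, since the actual encoder is not specified in this passage; `embed`, `combine`, `encode_character`, `DIM`, and the `"<SUB>"` placeholder are all assumed names. The sketch takes the tuples already ordered from the innermost structure to the root (a post-order over the structure nodes).

```python
import math
from typing import List, Sequence, Tuple

PLACEHOLDER = "<SUB>"  # hypothetical stand-in for the preset symbol node
DIM = 4                # toy vector size


def embed(symbol: str) -> List[float]:
    """Toy deterministic embedding of a component or shape structure."""
    seed = sum(ord(ch) for ch in symbol)
    return [math.sin(seed * (i + 1)) for i in range(DIM)]


def combine(structure_vec: Sequence[float],
            child_vecs: Sequence[Sequence[float]]) -> List[float]:
    """Toy combination: tanh(structure vector + mean of child vectors)."""
    n = len(child_vecs)
    return [math.tanh(s + sum(v[i] for v in child_vecs) / n)
            for i, s in enumerate(structure_vec)]


def encode_character(tuples_leaf_to_root: Sequence[Tuple[str, ...]]) -> List[float]:
    stack: List[List[float]] = []  # LIFO store of subtree encoding results
    for structure, *children in tuples_leaf_to_root:
        # Fetch one stored result per placeholder child. Pops return the most
        # recent result first, so reverse to restore left-to-right order.
        n_sub = children.count(PLACEHOLDER)
        sub_results = [stack.pop() for _ in range(n_sub)][::-1]
        sub_iter = iter(sub_results)
        child_vecs = [next(sub_iter) if c == PLACEHOLDER else embed(c)
                      for c in children]
        # Structure vector plus child vectors give this subtree's encoding.
        stack.append(combine(embed(structure), child_vecs))
    if len(stack) != 1:
        raise ValueError("tuples do not form a single tree")
    return stack.pop()  # encoding of the tuple that contains the root


# 湖 = ⿰(氵, ⿰(古, 月)), tuples listed from the innermost structure to the root
tuples = [("⿰", "古", "月"), ("⿰", "氵", PLACEHOLDER)]
char_repr = encode_character(tuples)
```

In terms of the text above, a popped stack entry plays the role of the first vector, an embedded component the second vector, and the embedded shape structure the third vector.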
[0199] In one embodiment, the Chinese character generation module 1508 is further configured to select a target font representation corresponding to the target font style from at least one candidate font representation. The acquisition module 1502 is further configured to acquire handwritten Chinese character images of different font styles, and to encode each handwritten Chinese character image to obtain candidate font representations corresponding to each font style.
[0200] In one embodiment, the Chinese character generation module 1508 is further used to combine the Chinese character representation and the target font representation to obtain a combined representation; process the combined representation through a feedforward network to obtain an image representation vector; adjust the dimension of the image representation vector to a preset dimension and adjust the size of the image representation vector to a preset size, and generate the target Chinese character of the image type based on the adjusted image representation vector.
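One possible reading of this step is sketched below in plain Python: the two representations are concatenated, passed through a single feedforward layer that projects to a preset number of values, and reshaped to a preset height and width. The function names, the single-layer network, the sigmoid activation, and the randomly initialised (untrained) weights are all assumptions for illustration; a trained network would be needed to produce a legible character.

```python
import math
import random
from typing import List, Sequence


def feedforward(x: Sequence[float],
                weights: Sequence[Sequence[float]],
                bias: Sequence[float]) -> List[float]:
    """One dense layer followed by a sigmoid, so outputs lie in (0, 1)."""
    out = []
    for row, b in zip(weights, bias):
        z = sum(w * xi for w, xi in zip(row, x)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def generate_image(char_repr: Sequence[float],
                   font_repr: Sequence[float],
                   height: int,
                   width: int,
                   seed: int = 0) -> List[List[float]]:
    # Combine the Chinese character representation with the font representation.
    combined = list(char_repr) + list(font_repr)
    # Untrained, randomly initialised weights (illustration only).
    rng = random.Random(seed)
    out_dim = height * width  # preset dimension of the image vector
    weights = [[rng.uniform(-1.0, 1.0) for _ in combined] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    flat = feedforward(combined, weights, bias)
    # Reshape the flat vector to the preset size (height x width).
    return [flat[r * width:(r + 1) * width] for r in range(height)]


image = generate_image([0.1, 0.2], [0.3], height=4, width=4)
```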
[0201] In one embodiment, the above-mentioned device further includes a training module, which is used to train the Chinese character recognition model by using the target Chinese character as training data to obtain a trained Chinese character recognition model; the trained Chinese character recognition model is used to recognize handwritten Chinese characters in the image to be recognized.
[0202] In one embodiment, the device further includes a recognition module for displaying an image recognition interface; uploading an image to be recognized containing handwritten Chinese characters through the image recognition interface; and performing image recognition processing on the image to be recognized using a trained Chinese character recognition model to obtain and output the Chinese character content included in the image to be recognized.
[0203] In one embodiment, the structure processing module 1504 is further configured to traverse each node in the multi-branch tree according to the traversal order of depth-first and left subtree-first, and record the component or shape structure corresponding to the corresponding node according to the traversal order to obtain the sequence structure; wherein, if the multi-branch tree has only one subtree structure, the sequence structure is represented by a subset; if the multi-branch tree includes multiple subtree structures, the sequence structure is represented by a set composed of multiple subsets nested in sequence, and each subset represents a subtree structure.
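The depth-first, left-subtree-first serialisation can be illustrated with the sketch below, which represents each "subset" as a nested Python list. The `Node` class and `to_sequence` function are hypothetical, and the nested-list form is only one way to realise the nested subsets described above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


Sequence_ = Union[str, List["Sequence_"]]


def to_sequence(node: Node) -> Sequence_:
    """Depth-first, left-to-right traversal.

    A leaf yields its component; a structure node yields a list (subset)
    starting with the shape structure, followed by its children in order.
    """
    if not node.children:
        return node.label
    return [node.label] + [to_sequence(child) for child in node.children]


# One subtree structure: 明 = ⿰(日, 月) gives a single subset.
single = to_sequence(Node("⿰", [Node("日"), Node("月")]))
# ["⿰", "日", "月"]

# Multiple subtree structures: 湖 = ⿰(氵, ⿰(古, 月)) gives nested subsets.
nested = to_sequence(
    Node("⿰", [Node("氵"), Node("⿰", [Node("古"), Node("月")])])
)
# ["⿰", "氵", ["⿰", "古", "月"]]
```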
[0204] For specific limitations regarding the Chinese character generation device, please refer to the limitations on the Chinese character generation method above, which will not be repeated here. Each module in the aforementioned Chinese character generation device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in the processor of a computer device in hardware form or independent of it, or stored in the memory of a computer device in software form, so that the processor can call and execute the operations corresponding to each module.
[0205] In one embodiment, a computer device is provided, which may be a terminal or a server, and whose internal structure may be as shown in Figure 16. The computer device includes a processor, memory, and a network interface connected via a system bus. The processor provides computational and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The database stores data such as multi-branch trees, sequence labels, font representations, and character representations of the Chinese characters to be generated. The network interface is used for communication with external terminals via a network connection. When the computer program is executed by the processor, it implements a Chinese character generation method.
[0206] Those skilled in the art will understand that the structure shown in Figure 16 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0207] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.
[0208] In one embodiment, a computer-readable storage medium is provided storing a computer program that, when executed by a processor, implements the steps in the above method embodiments.
[0209] In one embodiment, a computer program product or computer program is provided, the computer program product or computer program including computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, causing the computer device to perform the steps in the above method embodiments.
[0210] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the methods described above. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical storage, etc. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can be in various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), etc.
[0211] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0212] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the invention patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application should be determined by the appended claims.
Claims
1. A method for generating Chinese characters, characterized in that the method includes: obtaining more than one component and at least one shape structure corresponding to the Chinese character to be generated; generating, based on the components and the shape structure, a multi-branch tree corresponding to the Chinese character to be generated, wherein the leaf nodes of the multi-branch tree correspond to the components and the non-leaf nodes correspond to the shape structures; determining at least one tuple in the multi-branch tree based on the parent-child relationships between nodes in the multi-branch tree, wherein each tuple includes a parent node and at least two child nodes belonging to the parent node, the parent node in the tuple corresponds to a shape structure, when a child node of the parent node is a leaf node, the leaf node is a child node in the tuple, when a child node of the parent node is the parent node in another tuple, a preset symbol node is a child node in the tuple, and the preset symbol node represents the subtree structure corresponding to the parent node in the other tuple; traversing the tuples sequentially in the direction from the leaf nodes to the root node of the multi-branch tree and performing encoding based on the traversed tuples until the root node is reached, to obtain a Chinese character representation; and obtaining a target font representation corresponding to a target font style, and generating a target Chinese character based on the Chinese character representation and the target font representation.
2. The method according to claim 1, characterized in that, The acquisition of more than one component and at least one shape structure corresponding to the Chinese character to be generated includes: Obtain the input text corresponding to the Chinese character to be generated, and annotate the radicals of the input text; The input text is split based on the radical annotations to obtain more than one component; Determine at least one shape structure for forming the input text using the components.
3. The method according to claim 1, characterized in that generating the multi-branch tree corresponding to the Chinese character to be generated based on the components and the shape structure includes: obtaining the hierarchical relationship represented by each shape structure in the Chinese character to be generated, and the components corresponding to each shape structure; and arranging the shape structures and their corresponding components in order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure, to generate the corresponding multi-branch tree.
4. The method according to claim 3, characterized in that arranging the shape structures and their corresponding components in order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure, to generate the corresponding multi-branch tree, includes: using the outermost shape structure as the root node of the multi-branch tree to be generated; if there is a second-outermost shape structure, then, proceeding from the outside to the inside according to the hierarchical relationship corresponding to each shape structure and starting from the second-outermost shape structure, using the shape structure of the current layer as a child node of the shape structure of the layer immediately outside the current layer; if the shape structure of a layer has corresponding components, using those components as leaf nodes of the node where the shape structure of that layer is located; and generating the multi-branch tree based on the root node, the leaf nodes, and the intermediate nodes between the root node and the leaf nodes.
5. The method according to claim 1, characterized in that, The step of determining at least one tuple in the multi-way tree based on the parent-child relationships between nodes includes: Each node in the multi-way tree corresponding to the shape structure is taken as a target parent node; For each target parent node, if the child node of the target parent node is a leaf node, then the leaf node belonging to the target parent node is directly used as the target child node of the corresponding target parent node. If the child node of the target parent node is another target parent node, then the preset symbol node is used as the target child node of the corresponding target parent node; the preset symbol node represents the subtree structure corresponding to the other target parent node; Each target parent node and its child nodes constitute a tuple.
6. The method according to claim 1, characterized in that, The process of traversing the tuples sequentially from the leaf nodes to the root node of the multi-way tree and encoding them according to the traversed tuples until the root node is reached, to obtain the Chinese character representation, includes: Traverse the tuples in the multi-way tree sequentially from the leaf node to the root node; Encode the content corresponding to each node in the currently traversed tuple to obtain the subtree encoding result corresponding to the currently traversed tuple; If the tuple currently being traversed corresponds to the target child node of another tuple, then the subtree encoding result corresponding to the tuple currently being traversed is used as the content of the target child node of the other tuple. Obtain the next tuple as the current tuple and continue traversing. Return to the step of encoding the content corresponding to each node in the currently traversed tuple and continue execution until the root node is reached. The encoding result of the subtree corresponding to the tuple containing the root node is output as the Chinese character representation.
7. The method according to claim 6, characterized in that, The step of encoding the content corresponding to each node in the currently traversed tuple to obtain the subtree encoding result corresponding to the currently traversed tuple includes: If the target child node in the currently traversed tuple corresponds to the target parent node of another tuple, then the subtree encoding result corresponding to the other tuple is used as the first vector; If the content of the target child node in the currently traversed tuple is a component, then the data type of the content of the target child node is determined to be a character, and the content of the target child node is encoded into a second vector; Encode the shape structure corresponding to the target parent node in the currently traversed tuple into a third vector; Based on at least one of the first and second vectors, and the third vector, the subtree encoding result corresponding to the currently traversed tuple is determined.
8. The method according to claim 6, characterized in that, If the currently traversed tuple corresponds to the target child node of another tuple, then the subtree encoding result corresponding to the currently traversed tuple is used as the content of the target child node of the other tuple, including: Store the subtree encoding result corresponding to the currently traversed tuple into a last-in-first-out stack; If the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, then the subtree encoding result is obtained from the last-in-first-out stack and used as the content of the target child node of the next tuple.
9. The method according to claim 1, characterized in that, The process of obtaining the target font representation corresponding to the target font style includes: Select the target font representation that corresponds to the target font style from at least one candidate font representation; The steps for generating the candidate font representation include: Obtain images of handwritten Chinese characters in different font styles; Each of the handwritten Chinese character images is encoded to obtain candidate font representations corresponding to each font style.
10. The method according to claim 1, characterized in that, The process of generating target Chinese characters based on the Chinese character representation and the target font representation includes: By combining the Chinese character representation and the target font representation, a combined representation is obtained; The combined representation is processed by a feedforward network to obtain an image representation vector; The dimension of the image representation vector is adjusted to a preset dimension, and the size of the image representation vector is adjusted to a preset size. Based on the adjusted image representation vector, the target Chinese character of the image type is generated.
11. The method according to any one of claims 1 to 10, characterized in that, The target font style is a target handwritten font style, and the method further includes: The target Chinese character is used as training data to train the Chinese character recognition model, resulting in a trained Chinese character recognition model. The trained Chinese character recognition model is used to recognize handwritten Chinese characters in the image to be recognized.
12. The method according to claim 11, characterized in that, The method further includes: Displaying the image recognition interface; Upload an image containing handwritten Chinese characters to be recognized through the image recognition interface; The trained Chinese character recognition model is used to perform image recognition processing on the image to be recognized, and the Chinese character content included in the image to be recognized is obtained and output.
13. The method according to any one of claims 1 to 10, characterized in that the method further includes: traversing the nodes in the multi-branch tree in depth-first, left-subtree-first order, and recording the component or shape structure corresponding to each node in the traversal order to obtain a sequence structure; wherein, if the multi-branch tree has only one subtree structure, the sequence structure is represented by a subset; and if the multi-branch tree includes multiple subtree structures, the sequence structure is represented by a set composed of multiple subsets nested in sequence, each subset representing a subtree structure.
14. A Chinese character generation device, characterized in that, The device includes: The acquisition module is used to acquire more than one component and at least one shape structure corresponding to the Chinese character to be generated; The structure processing module is used to generate a multi-branch tree corresponding to the Chinese character to be generated based on the components and the shape structure; the leaf nodes of the multi-branch tree correspond to each component, and the non-leaf nodes correspond to each shape structure. The structure processing module is further configured to determine at least one tuple in the multi-way tree based on the parent-child relationship between nodes in the multi-way tree; each tuple includes a parent node and at least two child nodes belonging to the parent node; wherein, the parent node in the tuple corresponds to a shape structure, when the child node of the parent node is a leaf node, the leaf node is a child node in the tuple, when the child node of the parent node is the parent node in another tuple, a preset symbol node is a child node in the tuple, and the preset symbol node represents the subtree structure corresponding to the parent node in the other tuple; The encoding module is used to traverse the tuples sequentially from the leaf node to the root node of the multi-way tree and perform encoding processing based on the traversed tuples until the root node is reached, so as to obtain the Chinese character representation. The Chinese character generation module is used to obtain the target font representation corresponding to the target font style, and generate the target Chinese character based on the Chinese character representation and the target font representation.
15. The apparatus according to claim 14, characterized in that, The acquisition module is also used to acquire the input text corresponding to the Chinese character to be generated, and to annotate the input text with radicals; based on the radical annotations, the input text is split to obtain more than one component; Determine at least one shape structure for forming the input text using the components.
16. The apparatus according to claim 14, characterized in that, The structure processing module is also used to obtain the hierarchical relationship represented by each of the shape structures in the Chinese character to be generated, as well as the components corresponding to each shape structure; and to arrange the shape structures and their corresponding components in the order from the outside to the inside according to the hierarchical relationship corresponding to each shape structure to generate a corresponding multi-branch tree.
17. The apparatus according to claim 16, characterized in that the structure processing module is further configured to use the outermost shape structure as the root node of the multi-branch tree to be generated; if there is a second-outermost shape structure, then, proceeding from the outside to the inside according to the hierarchical relationship corresponding to each shape structure and starting from the second-outermost shape structure, sequentially use the shape structure of the current layer as a child node of the shape structure of the layer immediately outside the current layer; if the shape structure of a layer has corresponding components, use those components as leaf nodes of the node where the shape structure of that layer is located; and generate the multi-branch tree based on the root node, the leaf nodes, and the intermediate nodes between the root node and the leaf nodes.
18. The apparatus according to claim 14, characterized in that, The structure processing module is also used to treat each node in the multi-branch tree corresponding to the shape structure as a target parent node; for each target parent node, if the child node of the target parent node is a leaf node, then the leaf node belonging to the target parent node is directly used as the target child node of the corresponding target parent node. If the child node of the target parent node is another target parent node, then the preset symbol node is used as the target child node of the corresponding target parent node; the preset symbol node represents the subtree structure corresponding to the other target parent node; each target parent node and the target child node belonging to the target parent node form a tuple.
19. The apparatus according to claim 14, characterized in that, The encoding module is further configured to traverse the tuples in the multi-way tree sequentially from the leaf node to the root node; encode the content corresponding to each node in the currently traversed tuple to obtain the subtree encoding result corresponding to the currently traversed tuple; if the currently traversed tuple corresponds to the target child node of another tuple, then the subtree encoding result corresponding to the currently traversed tuple is used as the content of the target child node of the other tuple; obtain the next tuple as the current tuple and continue traversing, and return to the step of encoding the content corresponding to each node in the currently traversed tuple to continue execution until the root node is reached. The encoding result of the subtree corresponding to the tuple containing the root node is output as the Chinese character representation.
20. The apparatus according to claim 19, characterized in that, The encoding module is further configured to: if the target child node in the currently traversed tuple corresponds to the target parent node of another tuple, then use the subtree encoding result corresponding to the other tuple as a first vector; if the content of the target child node in the currently traversed tuple is a component, then determine that the data type of the content of the target child node is a character, and encode the content of the target child node as a second vector; encode the shape structure corresponding to the target parent node in the currently traversed tuple as a third vector; and determine the subtree encoding result corresponding to the currently traversed tuple based on at least one of the first and second vectors, and the third vector.
21. The apparatus according to claim 19, characterized in that, The encoding module is further configured to store the subtree encoding result corresponding to the currently traversed tuple into a last-in-first-out stack; if the target child node of the next tuple to be traversed corresponds to the target parent node of the currently traversed tuple, the subtree encoding result is obtained from the last-in-first-out stack and used as the content of the target child node of the next tuple.
22. The apparatus according to claim 14, characterized in that the Chinese character generation module is further configured to select the target font representation corresponding to the target font style from at least one candidate font representation; and the acquisition module is further configured to acquire handwritten Chinese character images of different font styles, and to encode each of the handwritten Chinese character images to obtain candidate font representations corresponding to each font style.
23. The apparatus according to claim 14, characterized in that, The Chinese character generation module is also used to combine the Chinese character representation and the target font representation to obtain a combined representation; process the combined representation through a feedforward network to obtain an image representation vector; adjust the dimension of the image representation vector to a preset dimension and adjust the size of the image representation vector to a preset size, and generate the target Chinese character of the image type based on the adjusted image representation vector.
24. The apparatus according to any one of claims 14 to 23, characterized in that, The target font style is a target handwritten font style. The device also includes a training module, which is used to train the Chinese character recognition model using the target Chinese character as training data to obtain a trained Chinese character recognition model. The trained Chinese character recognition model is used to recognize handwritten Chinese characters in the image to be recognized.
25. The apparatus according to claim 24, characterized in that, The device further includes a recognition module, which is used to display an image recognition interface; upload an image to be recognized containing handwritten Chinese characters through the image recognition interface; perform image recognition processing on the image to be recognized through the trained Chinese character recognition model to obtain and output the Chinese character content included in the image to be recognized.
26. The apparatus according to any one of claims 14 to 23, characterized in that, The structure processing module is also used to traverse each node in the multi-branch tree according to the traversal order of depth first and left subtree first, and record the component or shape structure corresponding to the corresponding node according to the traversal order to obtain the sequence structure; wherein, if the multi-branch tree has only one subtree structure, the sequence structure is represented by a subset; if the multi-branch tree includes multiple subtree structures, the sequence structure is represented by a set composed of multiple subsets nested in sequence, and each subset represents a subtree structure.
27. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, When the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 13.
28. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 13.
29. A computer program product comprising computer instructions, characterized in that, When the computer instructions are executed by the processor, they implement the steps of the method according to any one of claims 1 to 13.