Text translation method, apparatus, device, storage medium, and computer program product
By using a large language model to perform text recognition and paragraph segmentation on PDF files, the tedious problem of manually segmenting PDF files for translation is solved, achieving automated paragraph recognition and translation, thus improving translation efficiency and reading experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING QIHOOD TECHNOLOGY CO LTD
- Filing Date
- 2024-12-17
- Publication Date
- 2026-06-19
Figure CN122242534A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of intelligent translation technology, and in particular to text translation methods, apparatus, devices, storage media, and computer program products.
Background Technology
[0002] Currently, when conducting academic research or other research, it is often necessary to read a large amount of existing materials, which naturally leads to many situations where translation is required. With the continuous development of computer science, text translation can now be achieved intelligently. However, the translation of PDF files currently requires manual identification to segment the content in the PDF and then pasting the content into translation software, which is extremely inconvenient.
[0003] The above content is only used to help understand the technical solution of this application and does not represent an admission that the above content is prior art.
Summary of the Invention
[0004] The main purpose of this application is to provide a text translation method, apparatus, device, storage medium, and computer program product, which aims to solve the cumbersome technical problem that current PDF translation requires manual identification, segmentation, and pasting of content into translation software.
[0005] To achieve the above objectives, this application proposes a document translation method, which includes:
[0006] In response to a file translation command, extract text block data of text objects in the target file using a large language model;
[0007] Paragraph data is generated based on the text block data and the text line data of the text object;
[0008] The paragraphs are split based on the paragraph coverage information and content type information of the paragraph data. The split paragraph data is then translated and displayed according to the target layout scheme of the target file.
[0009] Optionally, the step of extracting text block data of text objects in the target file using a large language model in response to a file translation instruction includes:
[0010] In response to the file translation instruction, determine the target file;
[0011] The target file is imported into the large language model to obtain the original content and coordinate information of the target file;
[0012] The text block data of the text object in the target file is determined based on the original content and the coordinate information.
[0013] Optionally, the step of generating paragraph data based on the text block data and the text line data of the text object includes:
[0014] Determine the text line data of the text object based on the text block data;
[0015] The paragraph data of the text object is generated based on the text line data.
[0016] Optionally, the step of determining the text line data of the text object based on the text block data includes:
[0017] Determine text block classification data based on the text block data;
[0018] The text line break information is determined based on the text block classification data;
[0019] Text line data is generated based on the text line break information.
[0020] Optionally, the step of generating text line data based on the text line break information includes:
[0021] Determine the line content information and line font size information based on the text line break information;
[0022] Text line data is generated based on the line content information and line font size information.
[0023] Optionally, the step of generating paragraph data of the text object based on the text line data includes:
[0024] Determine line vertical distance information, line start information, and line end information based on the text line data;
[0025] The line spacing is determined based on the line vertical distance information;
[0026] The paragraph data of the text object is generated based on the line spacing, the line start information, and the line end information.
[0027] Optionally, the step of generating paragraph data of the text object based on the line spacing, line start information, and line end information includes:
[0028] Determine the line-start difference distance based on the line start information;
[0029] The line-end difference distance is determined based on the line end information;
[0030] The adjacent line classification information of the text object is determined based on the line-start difference distance, the line-end difference distance, and the line spacing;
[0031] The paragraph data of the text object is generated based on the adjacent line classification information.
[0032] Optionally, the step of splitting paragraphs based on the paragraph coverage information and content type information of the paragraph data, translating the split paragraph data, and displaying it according to the target layout scheme of the target file includes:
[0033] The paragraph data is analyzed using the large language model to obtain paragraph coverage information and content type information;
[0034] The paragraphs are split according to the content type information and the paragraph coverage information to obtain the split paragraph data;
[0035] The split paragraph data is translated and displayed according to the target layout scheme of the target file.
[0036] Optionally, the step of analyzing the paragraph data using the large language model to obtain paragraph coverage information and content type information includes:
[0037] Text recognition is performed on the paragraph data using the large language model to obtain text overlap information;
[0038] Determine paragraph coverage information based on the text overlap information;
[0039] The paragraph data is semantically analyzed using the large language model to obtain content type information.
[0040] Optionally, the step of splitting paragraphs based on the content type information and the paragraph coverage information to obtain the split paragraph data includes:
[0041] Generate a content splitting scheme based on the content type information;
[0042] A paragraph splitting scheme is generated based on the paragraph coverage information;
[0043] The paragraphs are split according to the content splitting scheme and the paragraph splitting scheme to obtain the split paragraph data.
[0044] Optionally, the step of generating a content splitting scheme based on the content type information includes:
[0045] The paragraph type classification is determined based on the content type information, and the paragraph type classification includes body text type, table of contents type, and table type.
[0046] The paragraph data is split into paragraphs based on the paragraph type classification to obtain a content splitting scheme.
[0047] Optionally, the step of generating a paragraph splitting scheme based on the paragraph coverage information includes:
[0048] The overlapping paragraphs are determined based on the paragraph coverage information;
[0049] Based on the overlapping paragraphs, paragraph analysis is performed on the paragraph data to obtain the re-segmentation information of the overlapping paragraphs;
[0050] Determine the paragraph splitting scheme based on the re-segmentation information.
[0051] Furthermore, to achieve the above objectives, this application also proposes a document translation apparatus, which includes:
[0052] The text extraction module is used to extract text block data of text objects in the target file in response to file translation commands, using a large language model.
[0053] The paragraph generation module is used to generate paragraph data based on the text block data and the text line data of the text object;
[0054] The segmentation and display module is used to segment paragraphs based on the paragraph coverage information and content type information of the paragraph data, translate the segmented paragraph data, and display it according to the target layout scheme of the target file.
[0055] Optionally, the text extraction module is further configured to, in response to a file translation instruction, determine a target file; import the target file into a large language model to obtain the original content and coordinate information of the target file; and determine the text block data of the text objects in the target file based on the original content and the coordinate information.
[0056] Optionally, the paragraph generation module is further configured to determine the text line data of the text object based on the text block data; and generate the paragraph data of the text object based on the text line data.
[0057] Optionally, the paragraph generation module is further configured to determine text block classification data based on the text block data; determine text line break information based on the text block classification data; and generate text line data based on the text line break information.
[0058] Optionally, the paragraph generation module is further configured to determine line content information and line font size information based on the text line break information; and generate text line data based on the line content information and line font size information.
[0059] In addition, to achieve the above objectives, this application also proposes a document translation device, the device comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being configured to implement the steps of the document translation method as described above.
[0060] In addition, to achieve the above objectives, this application also proposes a storage medium, which is a computer-readable storage medium, on which a computer program is stored, and which, when executed by a processor, implements the steps of the file translation method described above.
[0061] In addition, to achieve the above objectives, this application also provides a computer program product, which includes a computer program that, when executed by a processor, implements the steps of the document translation method described above.
[0062] One or more technical solutions proposed in this application have at least the following technical effects:
[0063] This application, in response to a document translation command, extracts text block data of text objects in a target file using a large language model; generates paragraph data based on the text block data and the text line data of the text objects; splits the paragraphs based on paragraph coverage information and content type information; translates the split paragraph data; and displays it according to the target file's target layout scheme. In this way, it achieves optimized text and paragraph recognition in PDF files using a large language model, followed by paragraph splitting based on paragraph coverage and content type. This allows for the translation of the optimized paragraphs and their display according to the original file's layout, ensuring automatic paragraph recognition and translation within the original file's layout. This simplifies the translation process while improving the reading experience.
Attached Figure Description
[0064] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.
[0065] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, for those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0066] Figure 1 is a flowchart illustrating an embodiment of the document translation method of this application;
[0067] Figure 2 is a flowchart illustrating the second embodiment of the document translation method of this application;
[0068] Figure 3 is a schematic diagram of L-shaped layout splitting provided by an embodiment of the document translation method of this application;
[0069] Figure 4 is a schematic diagram of table-of-contents layout splitting provided by an embodiment of the document translation method of this application;
[0070] Figure 5 is a schematic diagram of the module structure of the document translation device according to an embodiment of this application;
[0071] Figure 6 is a schematic diagram of the device structure of the hardware operating environment involved in the document translation method in an embodiment of this application.
[0072] The purpose, features, and advantages of this application will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Implementation
[0073] It should be understood that the specific embodiments described herein are merely illustrative of the technical solutions of this application and are not intended to limit this application.
[0074] To better understand the technical solution of this application, a detailed description will be provided below in conjunction with the accompanying drawings and specific implementation methods.
[0075] The main solution of this application embodiment is: in response to a file translation instruction, extract text block data of text objects in the target file using a large language model; generate paragraph data based on the text block data and the text line data of the text objects; split the paragraphs based on the paragraph coverage information and content type information of the paragraph data; translate the split paragraph data and display it according to the target layout scheme of the target file.
[0076] In this embodiment, for ease of description, the following description uses a computer as the executing entity.
[0077] When conducting academic or other research, it is often necessary to read a large amount of existing material, which naturally creates many situations that require translation. With the continuous development of computer science, text can now be translated intelligently. However, translating a PDF file currently requires manually identifying and segmenting its content and then pasting that content into translation software, which is extremely inconvenient.
[0078] This application provides a solution that uses a large language model to recognize text and paragraphs in PDF files, and then optimizes the segmentation of paragraphs based on paragraph coverage and content type. The optimized paragraphs can then be translated and displayed according to the original file layout, thus ensuring automatic paragraph recognition and translation within the original file layout. This simplifies the translation process while improving the reading experience.
[0079] It should be noted that the executing entity in this embodiment can be a computing service device with data processing, network communication, and program execution functions, such as a tablet computer, personal computer, or mobile phone, or an electronic device or server capable of performing the above functions. The following description uses a computer as an example to illustrate this embodiment and the subsequent embodiments.
[0080] Based on this, the embodiments of this application provide a document translation method. Refer to Figure 1, which is a flowchart illustrating the first embodiment of the document translation method of this application.
[0081] In this embodiment, the document translation method includes steps S10 to S30:
[0082] Step S10: In response to the file translation instruction, extract text block data of text objects in the target file using a large language model;
[0083] It should be noted that the process first receives a file translation command triggered by the user, meaning the user needs to automatically translate a target file. The target file can be a PDF file, a PPT file, or an image file, etc., containing text. This embodiment uses a PDF file as an example. First, the PDF file is identified, and then text block data is extracted using a large language model. The target language for translation can be any language; this embodiment does not impose any limitations on it.
[0084] In one feasible implementation, in order to accurately obtain text block data, step S10 includes: in response to a file translation instruction, determining a target file; importing the target file into a large language model to obtain the original content and coordinate information of the target file; and determining the text block data of the text objects in the target file based on the original content and the coordinate information.
[0085] It should be understood that the target file to be translated is first determined according to the file translation instruction, for example by its address, its storage location, or the file itself.
[0086] In practice, after obtaining the target file, it is imported into the large language model to obtain the original text content of the target file, as well as the coordinate information of each character block. The large language model is a pre-trained deep learning model that includes a PDF parsing library, capable of automatically reading the content and coordinates of PDF files.
[0087] It should be noted that after obtaining the original content and coordinate information, the text block data of the text object in the target file can be determined based on the original content and coordinate information, which includes absolute positioning coordinate information.
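The pairing of original content with coordinate information can be sketched as a small record type. This is a minimal sketch only: the `TextBlock` name, its fields, and the shape of `raw_items` (dicts with `text`, `bbox`, and `size` keys) are assumptions made for illustration, since the application does not specify a data layout or a parser output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical record for one extracted text block."""
    text: str         # original content of the block
    x0: float         # left edge, absolute page coordinates
    y0: float         # top edge
    x1: float         # right edge
    y1: float         # bottom edge
    font_size: float  # assumed to be reported by the parser
    page: int = 0     # page index, assumed to default to the first page


def build_text_blocks(raw_items):
    """Combine original content and coordinate information into blocks.

    `raw_items` is an assumed parser output: dicts with "text", "bbox"
    (x0, y0, x1, y1) and "size" keys. Whitespace-only items are skipped.
    """
    blocks = []
    for item in raw_items:
        text = item["text"].strip()
        if not text:
            continue
        x0, y0, x1, y1 = item["bbox"]
        blocks.append(
            TextBlock(text, x0, y0, x1, y1, item["size"], item.get("page", 0))
        )
    return blocks
```

Keeping absolute coordinates on every block is what lets the later steps reason about lines, paragraphs, and overlaps without returning to the file.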
[0088] Step S20: Generate paragraph data based on the text block data and the text line data of the text object;
[0089] It should be understood that after obtaining the text block data, processing the text block data can yield the text line data of the target file, which can then be organized to obtain the paragraph data.
[0090] In one feasible implementation, in order to accurately obtain paragraph data, step S20 includes: determining the text line data of the text object based on the text block data; and generating the paragraph data of the text object based on the text line data.
[0091] In practice, the text line data of the text object is first determined based on the text block data. This involves analyzing the coordinates and features of each character to obtain multiple text lines in the target file. Then, the features and data of the text lines are analyzed to summarize multiple paragraphs, i.e., paragraph data.
[0092] In one feasible implementation, in order to accurately obtain text line data, the step of determining the text line data of the text object based on the text block data includes: determining text block classification data based on the text block data; determining text line break information based on the text block classification data; and generating text line data based on the text line break information.
[0093] It should be noted that the original text block data is traversed to determine which text blocks belong to the same line based on their coordinate information. The text blocks in the same line are then classified to obtain text block classification data, and line data is generated based on the text block classification data.
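One way to decide which text blocks belong to the same line is to compare their vertical extents. The sketch below assumes that two blocks share a line when they overlap vertically by at least half the height of the shorter block; the `Block` type and the `min_overlap` threshold are illustrative assumptions, not values taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical text block with absolute coordinates."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def group_blocks_into_lines(blocks, min_overlap=0.5):
    """Group blocks whose vertical extents overlap into the same line.

    Two blocks share a line when their vertical overlap is at least
    `min_overlap` of the shorter block's height (an assumed threshold).
    """
    lines = []  # each entry is the list of blocks on one line
    for block in sorted(blocks, key=lambda b: (b.y0, b.x0)):
        placed = False
        for line in lines:
            ref = line[0]  # compare against the first block of the line
            overlap = min(ref.y1, block.y1) - max(ref.y0, block.y0)
            shorter = min(ref.y1 - ref.y0, block.y1 - block.y0)
            if shorter > 0 and overlap / shorter >= min_overlap:
                line.append(block)
                placed = True
                break
        if not placed:
            lines.append([block])
    # Order blocks left to right within each line.
    for line in lines:
        line.sort(key=lambda b: b.x0)
    return lines
```

Comparing overlap rather than exact top coordinates tolerates small baseline jitter between blocks on the same line.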
[0094] In one feasible implementation, in order to accurately generate text line data, the step of generating text line data based on the text line break information includes: determining line content information and line font size information based on the text line break information; and generating text line data based on the line content information and line font size information.
[0095] It should be understood that the line content information and line font size information of each line are first determined based on the text line break information. Then, based on the line content information and line font size information, the text line to which each text block belongs is determined, together with the position, size, and other parameters of each text line, finally yielding the text line data.
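A text line record carrying content, font size, and position might be assembled as follows. This is a sketch under stated assumptions: the `Block` and `TextLine` types are hypothetical, blocks are joined with a single space (which suits space-separated languages only), and the line's font size is taken as the size covering the most characters.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical text block belonging to one line."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float
    font_size: float


@dataclass
class TextLine:
    """Hypothetical text line record."""
    content: str
    font_size: float
    x0: float
    y0: float
    x1: float
    y1: float


def build_text_line(blocks):
    """Merge the blocks of one line into a single text line record."""
    ordered = sorted(blocks, key=lambda b: b.x0)
    content = " ".join(b.text for b in ordered)
    # Weight each font size by character count so that a stray
    # superscript does not decide the size of the whole line.
    sizes = Counter()
    for b in ordered:
        sizes[b.font_size] += len(b.text)
    font_size = sizes.most_common(1)[0][0]
    return TextLine(
        content,
        font_size,
        min(b.x0 for b in ordered),
        min(b.y0 for b in ordered),
        max(b.x1 for b in ordered),
        max(b.y1 for b in ordered),
    )
```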
[0096] In one feasible implementation, in order to obtain paragraph data of a text object based on the generated text line data, the step of generating paragraph data of the text object according to the text line data includes: determining line vertical distance information, line start information, and line end information according to the text line data; determining the line spacing according to the line vertical distance information; and generating paragraph data of the text object according to the line spacing, the line start information, and the line end information.
[0097] In practice, the vertical distance between each line of text in the text line data is first determined, as well as the position and spacing of the beginning and end of each text line. The line spacing can then be determined based on the vertical distance information. Finally, the text lines and text blocks contained in each paragraph of the text object are determined based on the line spacing of each adjacent text line, as well as the relevant parameters and positions of the beginning and end of the line.
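Deriving a line spacing from the vertical distances can be sketched as below. The application does not say how the spacing is computed from the distances; taking the median is an assumption chosen here because it is insensitive to the occasional large gap between paragraphs. `line_tops` is assumed to hold the top coordinate of each line, increasing down the page.

```python
from statistics import median


def vertical_distances(line_tops):
    """Distance from the top of each line to the top of the next one."""
    return [b - a for a, b in zip(line_tops, line_tops[1:])]


def typical_line_spacing(line_tops):
    """Estimate the regular line spacing as the median vertical distance.

    Returns None when fewer than two lines are available.
    """
    gaps = vertical_distances(line_tops)
    if not gaps:
        return None
    return median(gaps)
```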
[0098] In one feasible implementation, in order to accurately obtain paragraph data, the step of generating paragraph data of the text object based on the line spacing, line start information, and line end information includes: determining the line-start difference distance based on the line start information; determining the line-end difference distance based on the line end information; determining the adjacent line classification information of the text object based on the line-start difference distance, the line-end difference distance, and the line spacing; and generating paragraph data of the text object based on the adjacent line classification information.
[0099] It should be noted that the line-start difference distance is first determined based on the line start information; it is the horizontal distance between the start positions of adjacent text lines. Then, the line-end difference distance is determined based on the line end information; it is the horizontal distance between the end positions of adjacent text lines.
[0100] It should be understood that after the line-start and line-end difference distances are determined, the classification information of adjacent lines is determined in conjunction with the line spacing. Specifically, the first line of a paragraph is usually indented, so its start position differs from that of an ordinary text line. Similarly, the last line of a paragraph is often short and ends with a line break, so its end position differs from that of an ordinary text line.
[0101] In practice, after determining the classification information of adjacent lines, adjacent lines with differences in the beginning, end, or line spacing are identified as paragraph boundaries, thereby determining multiple text paragraphs, i.e., paragraph data of text objects.
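The boundary test on adjacent lines can be sketched as follows. The sketch assumes a single column of lines ordered top to bottom and treats a line pair as a paragraph boundary when the vertical gap is unusually large, when the second line starts further right (an indent), or when the first line ends further left (a short last line). The `Line` type and the `tol` and `gap_factor` thresholds are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical text line with the fields the test needs."""
    text: str
    x0: float  # line start (left edge)
    x1: float  # line end (right edge)
    y0: float  # top edge


def split_into_paragraphs(lines, spacing, tol=2.0, gap_factor=1.5):
    """Group consecutive lines into paragraphs.

    A new paragraph starts at `cur` when, relative to `prev`:
      - the vertical distance exceeds `gap_factor * spacing`, or
      - `cur` starts further right than `prev` by more than `tol`, or
      - `prev` ends further left than `cur` by more than `tol`.
    All thresholds are assumed values.
    """
    if not lines:
        return []
    paragraphs = [[lines[0]]]
    for prev, cur in zip(lines, lines[1:]):
        big_gap = (cur.y0 - prev.y0) > gap_factor * spacing
        start_diff = cur.x0 - prev.x0  # positive when cur is indented
        end_diff = cur.x1 - prev.x1    # positive when prev ended early
        if big_gap or start_diff > tol or end_diff > tol:
            paragraphs.append([cur])
        else:
            paragraphs[-1].append(cur)
    return paragraphs
```

A real layout would need more cases (hanging indents, centered headings, justified versus ragged text), so this covers only the indentation and short-last-line cues described above.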
[0102] Step S30: Segment the paragraphs according to the paragraph coverage information and content type information of the paragraph data, translate the segmented paragraph data and display it according to the target layout scheme of the target file.
[0103] It should be noted that after determining the paragraph data, the paragraph coverage information and content type information of each currently identified paragraph are determined. This involves determining whether there is regional overlap and whether special text types need to be split. In this way, the paragraphs can be split and automatically translated, and then displayed according to the original layout, which improves the user's reading experience and reduces the probability of translation errors.
[0104] This embodiment provides a document translation method. In response to a document translation command, it extracts text block data of text objects in a target file using a large language model; generates paragraph data based on the text block data and the text line data of the text objects; splits the paragraphs based on paragraph coverage information and content type information; translates the split paragraph data; and displays it according to the target file's target layout. This method achieves text and paragraph recognition in PDF files using a large language model, followed by optimized paragraph splitting based on paragraph coverage and content type. The resulting optimized paragraphs are then translated and displayed according to the original file's layout, ensuring automatic paragraph recognition and translation within the original file's layout. This simplifies the translation process while improving the reading experience.
[0105] The second embodiment of this application is based on the first embodiment; for content that is the same as or similar to the first embodiment, refer to the description above, which is not repeated here. On this basis, referring to Figure 2, step S30 includes steps S301 to S303:
[0106] Step S301: Analyze the paragraph data using the large language model to obtain paragraph coverage information and content type information;
[0107] It should be noted that the paragraph data is first input into the large language model to determine the paragraph coverage information and content type information, that is, whether paragraphs overlap one another and what type of content each identified paragraph contains.
[0108] In one feasible implementation, in order to determine paragraph coverage information and content type information, step S301 includes: performing text recognition on the paragraph data using the large language model to obtain text overlap information; determining paragraph coverage information based on the text overlap information; and performing semantic analysis on the paragraph data using the large language model to obtain content type information.
[0109] It should be understood that, firstly, text recognition is performed on paragraph data using a large language model, thereby obtaining text overlap information, that is, the overlap of the recognition boxes of various text blocks. Then, based on the text overlap information, the coverage information between each paragraph is determined.
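Turning overlapping recognition boxes into coverage information between paragraphs can be sketched with a plain rectangle intersection. Boxes are assumed to be `(x0, y0, x1, y1)` tuples, and the `min_ratio` threshold, which ignores boxes that merely touch, is an illustrative assumption.

```python
from itertools import combinations


def overlap_area(a, b):
    """Area of the intersection of two boxes given as (x0, y0, x1, y1)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def paragraph_coverage(boxes, min_ratio=0.05):
    """Return index pairs of paragraphs whose boxes overlap noticeably.

    The overlap is measured against the smaller of the two boxes;
    `min_ratio` is an assumed threshold.
    """
    covered = []
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        area = overlap_area(a, b)
        smaller = min(
            (a[2] - a[0]) * (a[3] - a[1]),
            (b[2] - b[0]) * (b[3] - b[1]),
        )
        if smaller > 0 and area / smaller >= min_ratio:
            covered.append((i, j))
    return covered
```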
[0110] In practice, when the paragraph data is input into the large language model, semantic understanding and analysis are also performed, and the content type of each paragraph is derived from the result. The content type can include charts, tables of contents, body text, etc.
[0111] Step S302: Perform paragraph splitting based on the content type information and the paragraph coverage information to obtain the split paragraph data;
[0112] It should be noted that the paragraph splitting was optimized from two aspects: content type and paragraph overlap, to prevent overlapping, ambiguity, and repetition of translated text, resulting in the split paragraph data.
[0113] In a feasible implementation, in order to accurately obtain the segmented paragraph data, step S302 includes: generating a content segmentation scheme based on the content type information; generating a paragraph segmentation scheme based on the paragraph coverage information; and performing paragraph segmentation based on the content segmentation scheme and the paragraph segmentation scheme to obtain the segmented paragraph data.
[0114] It should be understood that, firstly, a content splitting scheme is generated based on content type information, that is, a splitting scheme based on content type. Then, another paragraph splitting scheme is proposed based on paragraph coverage information. Finally, the content splitting scheme and the paragraph splitting scheme are combined to perform paragraph splitting, resulting in the split paragraph data.
[0115] In one feasible implementation, in order to accurately split the content type, the step of generating a content splitting scheme based on the content type information includes: determining the paragraph type classification based on the content type information, wherein the paragraph type classification includes body text type, table of contents type and table type; splitting the paragraph data into paragraphs based on the paragraph type classification to obtain the content splitting scheme.
[0116] It should be noted that the paragraph type is determined first, such as body text, table of contents, or table. For example, as shown in Figure 3, when a table of contents is identified, the ellipsis linking each title to its page number needs to be processed. For this special handling, based on content detection, if the current line belongs to the table of contents, the page number is separated from the entry, and meaningless dashes, ellipses, and other characters in the content are deleted. For tables, the dividing lines in the middle are removed.
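The table-of-contents handling can be illustrated with a regular expression that separates the page number and drops the leader characters. This is a sketch that assumes the line has already been classified as a table-of-contents entry and that the page number is a run of digits at the end of the line; the pattern and the `split_toc_line` name are illustrative.

```python
import re

# Title, then a run of at least three leader characters (dots, middle
# dots, ellipsis characters, underscores, dashes, or spaces), then a
# trailing page number.
TOC_LINE = re.compile(
    r"^(?P<title>.*?)\s*[.\u00b7\u2026_\-\u2013\u2014\s]{3,}(?P<page>\d+)\s*$"
)


def split_toc_line(line):
    """Split a table-of-contents line into (title, page number).

    Returns (stripped line, None) when no leader and page number are found.
    """
    match = TOC_LINE.match(line)
    if match is None:
        return line.strip(), None
    return match.group("title").strip(), int(match.group("page"))
```

Only the title would then be sent for translation, with the page number placed back at its original position.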
[0117] In one feasible implementation, the step of generating a paragraph splitting scheme based on the paragraph coverage information in order to split overlapping content includes: determining overlapping paragraphs based on the paragraph coverage information; performing paragraph analysis on the paragraph data based on the overlapping paragraphs to obtain re-segmentation information of the overlapping paragraphs; and determining a paragraph splitting scheme based on the re-segmentation information.
[0118] It should be understood that the generated paragraph data needs to be checked for overlaps or intersections, which may occur in complex layouts, such as when text sits next to a chart or table. For example, in Figure 4, E is an image, and paragraphs A, B, C, and D are arranged in an L-shape around image E; the paragraphs should therefore be split.
[0119] In practice, if overlapping paragraphs are detected, the paragraphs need to be split according to the specific layout logic to ensure that the content of each paragraph is continuous and does not overlap.
[0120] Step S303: Translate the split paragraph data and display it according to the target layout scheme of the target file.
[0121] It should be noted that after obtaining the segmented data, the segmented data is input into the target translation software or translation process to translate into the target language, and the translation results of each segment are obtained. Then, the translation results of each segment are displayed according to the original layout of the target file, that is, the translated text of each segment is placed according to the target layout scheme.
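Translating each segment and placing the result at the segment's original position can be sketched as below. The translation backend is passed in as a callable because the application leaves the translation software open; the `Paragraph` and `PlacedText` types are hypothetical, and fitting translated text into its box (font scaling, wrapping) is out of scope for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Paragraph:
    """Hypothetical split paragraph with its position in the layout."""
    text: str
    box: Box
    kind: str = "body"


@dataclass
class PlacedText:
    """Translated text together with the box it should be drawn in."""
    text: str
    box: Box


def translate_and_place(
    paragraphs: List[Paragraph], translate: Callable[[str], str]
) -> List[PlacedText]:
    """Translate each split paragraph and keep its original box."""
    placed = []
    for para in paragraphs:
        if not para.text.strip():
            continue  # nothing to translate
        placed.append(PlacedText(translate(para.text), para.box))
    return placed
```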
[0122] This embodiment analyzes the paragraph data using the large language model to obtain paragraph coverage information and content type information; it then segments the paragraphs based on the content type information and paragraph coverage information to obtain segmented paragraph data; finally, it translates the segmented paragraph data and displays it according to the target layout scheme of the target file. In this way, it achieves further segmentation of overlapping text and special types of text, thereby optimizing the translation results and content.
[0123] This application also provides a document translation device. Referring to Figure 5, the document translation device includes:
[0124] The text extraction module 10 is used to extract text block data of text objects in the target file in response to file translation instructions, using a large language model.
[0125] The paragraph generation module 20 is used to generate paragraph data based on the text block data and the text line data of the text object.
[0126] The segmentation and display module 30 is used to segment paragraphs based on the paragraph coverage information and content type information of the paragraph data, translate the segmented paragraph data, and display it according to the target layout scheme of the target file.
[0127] This embodiment, in response to a file translation command, extracts text block data of text objects in the target file using a large language model; generates paragraph data based on the text block data and the text line data of the text objects; splits the paragraphs based on the paragraph coverage information and content type information of the paragraph data; translates the split paragraph data; and displays it according to the target file's target layout scheme. In this way, it achieves text and paragraph recognition in PDF files using a large language model, and then optimizes paragraph splitting based on paragraph coverage and content type. This allows the final optimized paragraphs to be translated and displayed according to the original file's layout, ensuring automatic paragraph recognition and translation within the original file's layout. This simplifies the translation process while improving the reading experience.
[0128] In one embodiment, the text extraction module 10 is further configured to, in response to a file translation instruction, determine a target file; import the target file into a large language model to obtain the original content and coordinate information of the target file; and determine the text block data of the text object in the target file based on the original content and the coordinate information.
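For illustration, and assuming the large language model returns the original content and coordinate information as a JSON list of entries with `text` and `bbox` fields (a format chosen for this sketch, not one stated in the application), text block data could be assembled along these lines. `TextBlock` and `build_text_blocks` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBlock:
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def build_text_blocks(model_output: str) -> list[TextBlock]:
    """Pair each piece of original content with its coordinates."""
    blocks = []
    for item in json.loads(model_output):
        text = item["text"].strip()
        if not text:
            continue  # skip entries that carry coordinates but no content
        x0, y0, x1, y1 = item["bbox"]
        blocks.append(TextBlock(text, x0, y0, x1, y1))
    # Order top-to-bottom, then left-to-right, as a simple reading order.
    return sorted(blocks, key=lambda b: (b.y0, b.x0))


# Hypothetical model output; the JSON shape is assumed for this sketch.
sample = json.dumps([
    {"text": "world", "bbox": [60, 10, 100, 22]},
    {"text": "   ", "bbox": [0, 0, 5, 5]},
    {"text": "Hello", "bbox": [10, 10, 55, 22]},
])
text_blocks = build_text_blocks(sample)
```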
[0129] In one embodiment, the paragraph generation module 20 is further configured to determine the text line data of the text object based on the text block data; and generate paragraph data of the text object based on the text line data.
[0130] In one embodiment, the paragraph generation module 20 is further configured to determine text block classification data based on the text block data; determine text line break information based on the text block classification data; and generate text line data based on the text line break information.
[0131] In one embodiment, the paragraph generation module 20 is further configured to determine line content information and line font size information based on the text line break information; and generate text line data based on the line content information and line font size information.
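One possible reading of the line-generation steps above is sketched below: text blocks are grouped by vertical position, a line break is assumed wherever the top edge shifts by more than a tolerance, and each line then receives its content and font size. The `Block` and `Line` types, the `y_tolerance` parameter, and the choice of the largest font size in a line are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    text: str
    x0: float
    y0: float  # top edge
    x1: float
    y1: float  # bottom edge
    font_size: float


@dataclass(frozen=True)
class Line:
    text: str
    font_size: float
    x0: float
    y0: float
    x1: float
    y1: float


def _merge(group: list[Block]) -> Line:
    """Combine the blocks of one line into line content and line font size."""
    ordered = sorted(group, key=lambda b: b.x0)
    return Line(
        text=" ".join(b.text for b in ordered),
        font_size=max(b.font_size for b in ordered),
        x0=min(b.x0 for b in ordered),
        y0=min(b.y0 for b in ordered),
        x1=max(b.x1 for b in ordered),
        y1=max(b.y1 for b in ordered),
    )


def build_lines(blocks: list[Block], y_tolerance: float = 2.0) -> list[Line]:
    lines: list[Line] = []
    current: list[Block] = []
    for block in sorted(blocks, key=lambda b: (b.y0, b.x0)):
        # A line break is assumed when the top edge moves by more than y_tolerance.
        if current and abs(block.y0 - current[0].y0) > y_tolerance:
            lines.append(_merge(current))
            current = []
        current.append(block)
    if current:
        lines.append(_merge(current))
    return lines


blocks = [
    Block("world", 60, 10.5, 100, 22, 12),
    Block("Hello", 10, 10, 55, 22, 12),
    Block("Next line", 10, 30, 90, 42, 12),
]
lines = build_lines(blocks)
```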
[0132] In one embodiment, the paragraph generation module 20 is further configured to determine line vertical distance information, line beginning information, and line ending information based on the text line data; determine line spacing based on the line vertical distance; and generate paragraph data of the text object based on the line spacing, line beginning information, and line ending information.
[0133] In one embodiment, the paragraph generation module 20 is further configured to determine the line beginning distance based on the line beginning information; determine the line ending distance based on the line ending information; determine the adjacent line classification information of the text object based on the line beginning distance, the line ending distance, and the line spacing; and generate paragraph data of the text object based on the adjacent line classification information.
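The adjacent-line classification described above can be illustrated with the following sketch, which decides for each pair of neighboring lines whether the second continues the paragraph of the first. The thresholds (`gap_factor`, `indent_tol`, `short_tol`), the use of the median gap as the line spacing, and the specific rules are assumptions for this example and are not taken from the application.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Line:
    text: str
    x0: float  # line beginning (left edge)
    x1: float  # line ending (right edge)
    y0: float  # top edge
    y1: float  # bottom edge


def same_paragraph(prev: Line, cur: Line, spacing: float, right_edge: float,
                   gap_factor: float = 1.5, indent_tol: float = 5.0,
                   short_tol: float = 20.0) -> bool:
    gap = cur.y0 - prev.y1           # line vertical distance
    begin_diff = cur.x0 - prev.x0    # line beginning distance
    end_diff = right_edge - prev.x1  # line ending distance to the column edge
    if gap > gap_factor * spacing:
        return False  # unusually large gap: a new paragraph starts
    if begin_diff > indent_tol:
        return False  # current line is indented: a new paragraph starts
    if end_diff > short_tol:
        return False  # previous line stopped short: its paragraph ended
    return True


def build_paragraphs(lines: list[Line]) -> list[str]:
    if not lines:
        return []
    gaps = [b.y0 - a.y1 for a, b in zip(lines, lines[1:])]
    spacing = median(gaps) if gaps else 0.0
    right_edge = max(line.x1 for line in lines)
    groups = [[lines[0]]]
    for prev, cur in zip(lines, lines[1:]):
        if same_paragraph(prev, cur, spacing, right_edge):
            groups[-1].append(cur)
        else:
            groups.append([cur])
    return [" ".join(line.text for line in g) for g in groups]


sample_lines = [
    Line("First paragraph starts", 20, 200, 0, 12),
    Line("and continues here", 10, 200, 15, 27),
    Line("until it ends.", 10, 120, 30, 42),
    Line("Second paragraph", 20, 200, 45, 57),
    Line("goes on.", 10, 80, 60, 72),
]
paragraphs = build_paragraphs(sample_lines)
```

In the sample, the line spacing is uniform, so the break between the two paragraphs is found from the indentation of the fourth line rather than from the vertical gap.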
[0134] In one embodiment, the segmentation and display module 30 is further configured to analyze the paragraph data through the large language model to obtain paragraph coverage information and content type information; split the paragraphs according to the content type information and the paragraph coverage information to obtain split paragraph data; and translate the split paragraph data and display it according to the target layout scheme of the target file.
[0135] In one embodiment, the segmentation and display module 30 is further configured to perform text recognition on the paragraph data through the large language model to obtain text overlap information; determine paragraph coverage information based on the text overlap information; and perform semantic analysis on the paragraph data through the large language model to obtain content type information.
[0136] In one embodiment, the segmentation and display module 30 is further configured to generate a content splitting scheme based on the content type information; generate a paragraph splitting scheme based on the paragraph coverage information; and split the paragraphs according to the content splitting scheme and the paragraph splitting scheme to obtain the split paragraph data.
[0137] In one embodiment, the segmentation and display module 30 is further configured to determine a paragraph type classification based on the content type information, wherein the paragraph type classification includes a body text type, a table of contents type, and a table type; and to split the paragraph data into paragraphs based on the paragraph type classification to obtain a content splitting scheme.
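As a hedged illustration of a content splitting scheme keyed on the three paragraph types named above, the sketch below keeps body text whole, splits a table of contents into one segment per entry, and splits a table into one segment per cell. The `ContentType` enum, the `split_by_type` function, and the assumption that table cells are tab-separated are all introduced for this example; the application does not specify how each type is split.

```python
from enum import Enum


class ContentType(Enum):
    BODY = "body"
    TOC = "table_of_contents"
    TABLE = "table"


def split_by_type(lines: list[str], kind: ContentType) -> list[str]:
    """Return the translation segments for one paragraph of the given type."""
    if kind is ContentType.BODY:
        # Body text is translated as one continuous segment.
        return [" ".join(lines)]
    if kind is ContentType.TOC:
        # Each table-of-contents entry is its own segment.
        return [line.strip() for line in lines if line.strip()]
    # TABLE: cells are assumed to be tab-separated within each row.
    return [
        cell.strip()
        for line in lines
        for cell in line.split("\t")
        if cell.strip()
    ]


body = split_by_type(["A sentence that wraps", "onto a second line."], ContentType.BODY)
toc = split_by_type(["1 Introduction", "", "2 Method"], ContentType.TOC)
table = split_by_type(["Name\tValue", "alpha\t1"], ContentType.TABLE)
```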
[0138] In one embodiment, the segmentation and display module 30 is further configured to determine overlapping paragraphs based on the paragraph coverage information; perform paragraph analysis on the paragraph data based on the overlapping paragraphs to obtain re-segmentation information of the overlapping paragraphs; and determine a paragraph splitting scheme based on the re-segmentation information.
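The re-segmentation of an overlapping paragraph might, for example, be sketched as follows: the lines of the paragraph are divided into runs according to whether each line lies in the same vertical band as the obstructing region (such as an image). The `Line` type, the `resegment` function, and the band-based rule are assumptions for this illustration only.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Line:
    text: str
    y0: float  # top edge
    y1: float  # bottom edge


def resegment(lines: list[Line], obstacle_y0: float, obstacle_y1: float) -> list[list[str]]:
    """Split a paragraph wherever its lines switch between running beside
    the obstacle and running clear of it."""
    def beside(line: Line) -> bool:
        # The line shares a vertical band with the obstacle.
        return line.y0 < obstacle_y1 and obstacle_y0 < line.y1

    return [[line.text for line in run] for _, run in groupby(lines, key=beside)]


overlapping_paragraph = [
    Line("full-width line one", 0, 10),
    Line("full-width line two", 12, 22),
    Line("narrow line beside image", 30, 40),
    Line("another narrow line", 42, 52),
]
segments = resegment(overlapping_paragraph, obstacle_y0=28, obstacle_y1=55)
```

Each resulting run can then be treated as its own paragraph, so that its content is continuous and no longer overlaps the obstructing region.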
[0139] The document translation device provided in this application, employing the document translation method described in the above embodiments, can solve the cumbersome technical problem of current PDF translation requiring manual identification, segmentation, and pasting of content into translation software. Compared with the prior art, the beneficial effects of the document translation device provided in this application are the same as those of the document translation method provided in the above embodiments, and other technical features in the document translation device are the same as those disclosed in the methods of the above embodiments, and will not be repeated here.
[0140] This application provides a document translation device, which includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, which are executed by the at least one processor to enable the at least one processor to perform the document translation method in Embodiment 1 above.
[0141] Referring now to Figure 6, a structural schematic of a document translation device suitable for implementing embodiments of this application is shown. The document translation device in this application may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (Personal Digital Assistants), PADs (tablet computers), PMPs (Portable Media Players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The document translation device shown in Figure 5 is merely an example and should not impose any limitations on the functionality and scope of use of the embodiments of this application.
[0142] As shown in Figure 6, the document translation device may include a processing unit 1001 (e.g., a central processing unit, a graphics processing unit, etc.), which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1002 or a program loaded from a storage device 1003 into a random access memory (RAM) 1004. The RAM 1004 also stores various programs and data required for the operation of the document translation device. The processing unit 1001, ROM 1002, and RAM 1004 are interconnected via a bus 1005. An input/output (I/O) interface 1006 is also connected to the bus 1005. Typically, the following devices can be connected to the I/O interface 1006: input devices 1007 including, for example, a touchscreen, touchpad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 1008 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 1003 including, for example, magnetic tape, hard disk, etc.; and communication devices 1009. The communication device 1009 allows the document translation device to communicate with other devices, wirelessly or by wire, to exchange data. Although the figure shows a document translation device with various components, it should be understood that implementing or having all of the components shown is not required. More or fewer components may alternatively be implemented.
[0143] Specifically, according to the embodiments disclosed in this application, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments disclosed in this application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via the communication device 1009, installed from the storage device 1003, or installed from the ROM 1002. When the computer program is executed by the processing unit 1001, it performs the functions defined in the methods of the embodiments disclosed in this application.
[0144] The document translation device provided in this application, employing the document translation method described in the above embodiments, solves the cumbersome technical problem of current PDF translation requiring manual identification, segmentation, and pasting of content into translation software. Compared with the prior art, the beneficial effects of the document translation device provided in this application are the same as those of the document translation method provided in the above embodiments, and other technical features of the document translation device are the same as those disclosed in the previous embodiment method, and will not be repeated here.
[0145] It should be understood that the various parts disclosed in this application can be implemented using hardware, software, firmware, or a combination thereof. In the description of the above embodiments, specific features, structures, materials, or characteristics can be combined in any suitable manner in one or more embodiments or examples.
[0146] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
[0147] This application provides a computer-readable storage medium having computer-readable program instructions (i.e., a computer program) stored thereon, the computer-readable program instructions being used to execute the document translation method described in the above embodiments.
[0148] The computer-readable storage medium provided in this application may be, for example, a USB flash drive, but is not limited thereto; it may also be an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In this embodiment, the computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. The program code contained on the computer-readable storage medium may be transmitted using any suitable medium, including but not limited to: wires, optical cables, RF (Radio Frequency), etc., or any suitable combination thereof.
[0149] The aforementioned computer-readable storage medium may be included in the document translation device; or it may exist independently and not be assembled into the document translation device.
[0150] The aforementioned computer-readable storage medium carries one or more programs. When the one or more programs are executed by a document translation device, the document translation device is caused to: in response to a file translation instruction, extract text block data of text objects in a target file using a large language model; generate paragraph data based on the text block data and the text line data of the text objects; split paragraphs based on the paragraph coverage information and content type information of the paragraph data; translate the split paragraph data; and display it according to the target layout scheme of the target file.
[0151] Computer program code for performing the operations of this application can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a Local Area Network (LAN) or a Wide Area Network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).
[0152] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0153] The modules described in the embodiments of this application can be implemented in software or hardware. The name of a module does not, in itself, limit the functionality of that module.
[0154] The readable storage medium provided in this application is a computer-readable storage medium that stores computer-readable program instructions (i.e., computer programs) for executing the above-described document translation method. This solves the cumbersome technical problem of current PDF translation methods, which require manual identification, segmentation, and pasting of content into translation software. Compared with the prior art, the beneficial effects of the computer-readable storage medium provided in this application are the same as those of the document translation method provided in the above embodiments, and will not be elaborated upon here.
[0155] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the document translation method described above.
[0156] The computer program product provided in this application can solve the cumbersome technical problem of current PDF translation, which requires manual identification, segmentation, and pasting of content into translation software. Compared with the prior art, the beneficial effects of the computer program product provided in this application are the same as those of the document translation method provided in the above embodiments, and will not be repeated here.
[0157] The above description is only a part of the embodiments of this application and does not limit the patent scope of this application. All equivalent structural transformations made under the technical concept of this application and using the contents of the specification and drawings of this application, or direct / indirect applications in other related technical fields, are included in the patent protection scope of this application.
[0158] This invention discloses A1. A document translation method, the method comprising:
[0159] In response to a file translation command, extract text block data of text objects in the target file using a large language model;
[0160] Paragraph data is generated based on the text block data and the text line data of the text object;
[0161] The paragraphs are split based on the paragraph coverage information and content type information of the paragraph data. The split paragraph data is then translated and displayed according to the target layout scheme of the target file.
[0162] A2. As described in A1, the step of extracting text block data of text objects in the target file using a large language model in response to a file translation instruction includes:
[0163] In response to the file translation instruction, determine the target file;
[0164] The target file is imported into the large language model to obtain the original content and coordinate information of the target file;
[0165] The text block data of the text object in the target file is determined based on the original content and the coordinate information.
[0166] A3. As described in A1, the step of generating paragraph data based on the text block data and the text line data of the text object includes:
[0167] Determine the text line data of the text object based on the text block data;
[0168] The paragraph data of the text object is generated based on the text line data.
[0169] A4. As described in A3, the step of determining the text line data of the text object based on the text block data includes:
[0170] Determine text block classification data based on the text block data;
[0171] The text line break information is determined based on the text block classification data;
[0172] Text line data is generated based on the text line break information.
[0173] A5. As described in A4, the step of generating text line data based on the text line break information includes:
[0174] Determine the line content information and line font size information based on the text line break information;
[0175] Text line data is generated based on the line content information and line font size information.
[0176] A6. As described in A3, the step of generating paragraph data of the text object based on the text line data includes:
[0177] Determine line vertical distance information, line beginning information, and line ending information based on the text line data;
[0178] The line spacing is determined based on the line vertical distance information;
[0179] The paragraph data of the text object is generated based on the line spacing, line beginning information, and line ending information.
[0180] A7. As described in A6, the step of generating paragraph data of the text object based on the line spacing, line beginning information, and line ending information includes:
[0181] Determine the line beginning difference distance based on the line beginning information;
[0182] The line tail difference distance is determined based on the line tail information;
[0183] The adjacent line classification information of the text object is determined based on the line beginning difference distance, the line ending difference distance, and the line spacing.
[0184] The paragraph data of the text object is generated based on the adjacent line classification information.
[0185] A8. As described in A1, the steps of splitting paragraphs based on the paragraph coverage information and content type information of the paragraph data, translating the split paragraph data, and displaying it according to the target layout scheme of the target file include:
[0186] The paragraph data is analyzed using the large language model to obtain paragraph coverage information and content type information;
[0187] The paragraphs are split according to the content type information and the paragraph coverage information to obtain the split paragraph data;
[0188] The split paragraph data is translated and displayed according to the target layout scheme of the target file.
[0189] A9. As described in A8, the step of analyzing the paragraph data using the large language model to obtain paragraph coverage information and content type information includes:
[0190] Text recognition is performed on the paragraph data using the large language model to obtain text overlap information;
[0191] Determine paragraph coverage information based on the text overlap information;
[0192] The paragraph data is semantically analyzed using the large language model to obtain content type information.
[0193] A10. As described in A8, the step of splitting paragraphs based on the content type information and the paragraph coverage information to obtain the split paragraph data includes:
[0194] Generate a content splitting scheme based on the content type information;
[0195] A paragraph splitting scheme is generated based on the paragraph coverage information;
[0196] The paragraphs are split according to the content splitting scheme and the paragraph splitting scheme to obtain the split paragraph data.
[0197] A11. As described in A10, the step of generating a content splitting scheme based on the content type information includes:
[0198] The paragraph type classification is determined based on the content type information, and the paragraph type classification includes body text type, table of contents type, and table type.
[0199] The paragraph data is split into paragraphs based on the paragraph type classification to obtain a content splitting scheme.
[0200] A12. As described in A10, the step of generating a paragraph splitting scheme based on the paragraph coverage information includes:
[0201] The overlapping paragraphs are determined based on the paragraph coverage information;
[0202] Based on the overlapping paragraphs, paragraph analysis is performed on the paragraph data to obtain the re-segmentation information of the overlapping paragraphs;
[0203] The paragraph splitting scheme is determined based on the re-segmentation information.
[0204] The present invention also discloses B13. A document translation apparatus, the apparatus comprising:
[0205] The text extraction module is used to extract text block data of text objects in the target file in response to file translation commands, using a large language model.
[0206] The paragraph generation module is used to generate paragraph data based on the text block data and the text line data of the text object;
[0207] The segmentation and display module is used to segment paragraphs based on the paragraph coverage information and content type information of the paragraph data, translate the segmented paragraph data, and display it according to the target layout scheme of the target file.
[0208] B14. The document translation apparatus as described in B13, wherein the text extraction module is further configured to, in response to a document translation instruction, determine a target file; import the target file into a large language model to obtain the original content and coordinate information of the target file; and determine text block data of text objects in the target file based on the original content and the coordinate information.
[0209] B15. The document translation apparatus as described in B13, wherein the paragraph generation module is further configured to determine the text line data of the text object based on the text block data; and generate paragraph data of the text object based on the text line data.
[0210] B16. The document translation apparatus as described in B15, wherein the paragraph generation module is further configured to determine text block classification data based on the text block data; determine text line break information based on the text block classification data; and generate text line data based on the text line break information.
[0211] B17. In the document translation apparatus as described in B16, the paragraph generation module is further configured to determine line content information and line font size information based on the text line break information; and generate text line data based on the line content information and line font size information.
[0212] The present invention also discloses C18. A document translation device, the device comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being configured to implement the steps of the document translation method as described above.
[0213] The present invention also discloses D19. A storage medium, which is a computer-readable storage medium, wherein a computer program is stored on the storage medium, and the computer program, when executed by a processor, implements the steps of the document translation method described above.
[0214] The present invention also discloses E20. A computer program product comprising a computer program that, when executed by a processor, implements the steps of the document translation method described above.
Claims
1. A document translation method, characterized in that the method includes: in response to a file translation instruction, extracting text block data of text objects in a target file using a large language model; generating paragraph data based on the text block data and text line data of the text objects; and splitting paragraphs based on paragraph coverage information and content type information of the paragraph data, translating the split paragraph data, and displaying it according to a target layout scheme of the target file.
2. The method of claim 1, wherein the step of extracting text block data of text objects in a target file using a large language model in response to a file translation instruction includes: determining the target file in response to the file translation instruction; importing the target file into the large language model to obtain original content and coordinate information of the target file; and determining the text block data of the text objects in the target file based on the original content and the coordinate information.
3. The method of claim 1, wherein the step of generating paragraph data based on the text block data and the text line data of the text objects includes: determining the text line data of the text objects based on the text block data; and generating the paragraph data of the text objects based on the text line data.
4. The method of claim 3, wherein the step of determining the text line data of the text objects based on the text block data includes: determining text block classification data based on the text block data; determining text line break information based on the text block classification data; and generating the text line data based on the text line break information.
5. The method of claim 4, wherein the step of generating the text line data based on the text line break information includes: determining line content information and line font size information based on the text line break information; and generating the text line data based on the line content information and the line font size information.
6. The method of claim 3, wherein the step of generating the paragraph data of the text objects based on the text line data includes: determining line vertical distance information, line beginning information, and line ending information based on the text line data; determining line spacing based on the line vertical distance information; and generating the paragraph data of the text objects based on the line spacing, the line beginning information, and the line ending information.
7. A document translation apparatus, characterized in that the apparatus includes: a text extraction module, used to extract, in response to a file translation instruction and using a large language model, text block data of text objects in a target file; a paragraph generation module, used to generate paragraph data based on the text block data and text line data of the text objects; and a segmentation and display module, used to split paragraphs based on paragraph coverage information and content type information of the paragraph data, translate the split paragraph data, and display it according to a target layout scheme of the target file.
8. A document translation device, characterized in that the device includes: a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being configured to implement the steps of the document translation method as described in any one of claims 1 to 6.
9. A storage medium, characterized in that the storage medium is a computer-readable storage medium, and a computer program is stored on the storage medium; when the computer program is executed by a processor, it implements the steps of the document translation method as described in any one of claims 1 to 6.
10. A computer program product, characterized in that the computer program product includes a computer program that, when executed by a processor, implements the steps of the document translation method as described in any one of claims 1 to 6.