A time domain information parsing and processing method based on a large language model
By constructing a large language model vocabulary and a temporal encoder, and combining them with self-supervised learning, the method enables a large language model to parse and process temporal information efficiently. This addresses the time-consuming and labor-intensive multimodal data processing of existing technologies and is applicable to multiple practical fields.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: SHANDONG INSPUR SCI RES INST CO LTD
- Filing Date: 2024-05-28
- Publication Date: 2026-06-19
AI Technical Summary
Large language models face inherent obstacles when processing non-textual modal information, especially temporal information. They require fine-tuning model parameters or building complex intermodal bridges, which is time-consuming, labor-intensive, and requires a large amount of labeled data.
By constructing a large language model vocabulary, a temporal encoder, a quantization dictionary, and a temporal decoder, and leveraging the contextual learning capabilities of the large language model combined with self-supervised learning, the parsing and processing of temporal information is achieved, including the mapping between the language encoder and the text projector, and the discretized representation of the quantization dictionary.
The method can efficiently process time-domain data without modifying the parameters of the large language model, reducing learning costs and improving production efficiency. It is suitable for fields such as time series signal prediction, IoT signal processing, and medical and health monitoring.
Figure CN118503680B_ABST
Description
Technical Field
[0001] This invention relates to the field of large language model technology, and in particular to a method for temporal information parsing and processing based on large language models.
Background Technology
[0002] Today, large language models (LLMs) are developing rapidly and being used more and more widely. With their powerful contextual understanding and generation capabilities, they have achieved remarkable results in the field of natural language processing.
[0003] The development of large language models began with basic language research in the 20th century. After early attempts at simple models, the field experienced explosive growth with the rise of the deep learning revolution in the early 21st century. Since 2018, breakthroughs in pre-training techniques have ushered in a new era for large language models. These models, pre-trained on massive amounts of text data, have learned to understand and generate the complex nuances of human language, with the number of parameters increasing from millions to hundreds of billions or even trillions, significantly improving the performance of natural language processing tasks. This has enabled large language models to be applied to fields such as writing, dialogue, and translation, profoundly changing human-computer interaction patterns and daily life. Today, large language models have become a core driving force in the field of artificial intelligence, continuously promoting technological innovation and social change, foreshadowing the arrival of a highly intelligent era.
[0004] Time-domain information reveals the dynamic behavior of signals over time, such as trends, periodicity, and transient phenomena. Its processing needs are widespread in many critical fields, such as time-series signal prediction in industrial control, sensor signal processing in the Internet of Things (IoT), and physiological monitoring in the healthcare field. Accurate and efficient analysis and processing are of great value for improving system performance and optimizing decision-making processes. Traditionally, time-domain information technology encompasses everything from basic signal acquisition, preprocessing, and feature extraction to in-depth time-domain analysis methods, such as time-domain statistical analysis, time-domain sequence analysis, and short-time analysis techniques. Today, the rise of large-scale models is bringing new capabilities to the field of time-domain information processing. The powerful logic and generalization capabilities of large-scale models can provide more user-friendly interaction methods and more universal approaches to time-domain information analysis and processing.
[0005] However, large language models were initially designed primarily for text data, presenting inherent challenges in processing non-textual modal information, such as temporal data. Enabling large language models to understand and process non-textual modal data typically requires fine-tuning model parameters or constructing complex inter-modal bridges. This process is not only time-consuming and labor-intensive but also requires substantial amounts of labeled data, posing a significant obstacle in practical applications. This is especially true for temporal data, whose unique dynamic characteristics and potential complex patterns make this task even more challenging.
[0006] Based on the above, this invention proposes a time-domain information parsing and processing method based on a large language model, aiming to open up a low-threshold, high-efficiency information processing path that gives the large language model the ability to process time-domain information without modifying its architecture or parameters.
Summary of the Invention
[0007] To overcome the shortcomings of existing technologies, this invention provides a simple and efficient method for temporal information parsing and processing based on a large language model.
[0008] This invention is achieved through the following technical solution:
[0009] A method for temporal information parsing and processing based on a large language model, characterized by the following steps:
[0010] Step S01: Construct a large language model vocabulary by randomly combining uncommon texts, such as rare characters and characters from various countries; randomly sample and combine the text words in the large language model vocabulary, input them into the language encoder to obtain the text encoding, and input the text encoding into the text projector; after the text projector further transforms and processes the text features, it outputs the projected text encoding used to map temporal features.
[0011] Step S02: Establish a time-domain information autoencoder, including a time-domain encoder, a quantization dictionary, and a time-domain decoder;
[0012] Temporal features are obtained by dimensionality reduction and encoding of temporal information through a temporal encoder.
[0013] The quantization dictionary receives the temporal features output by the temporal encoder, maps them to similar projected text codes, and performs self-supervised learning to discretize the temporal information.
[0014] Step S03: Input the prompt words into the large language model. The large language model, combined with the quantization dictionary, determines whether the prompt words contain temporal information.
[0015] If time-domain information is included, the quantized time-domain information is received through a time-domain decoder, decoded to the original time-domain information, and the processing result is output.
[0016] If no time-domain information is included, the processing result will be output directly.
[0017] In step S01, the language encoder uses a pre-trained Transformer network, and the text projector is composed of a multilayer perceptron (MLP network) to map the language modality to the temporal information modality.
[0018] In the vocabulary of the large language model, uncommon texts are randomly selected and combined using one, two, or three tokens to form longer phrases that express temporal signals. Although these texts cannot be interpreted by humans, they can be understood by the model using the contextual capabilities of the large language model.
[0019] In step S02, both the time-domain encoder and the time-domain decoder use Transformer networks;
[0020] Before encoding, the time-domain information is first segmented according to a user-defined interval. The segmentation unit is determined by the user based on the measurement unit of the time-domain information and the specific task. For example, if predicting hourly weather time series information, the segmentation unit is hours. After segmentation, the data is normalized and then input into the time-domain encoder.
[0021] The time-domain encoder performs dimensionality reduction and encoding of time-domain information;
[0022] The quantization dictionary consists of key-value pairs, where the key is a word number and the value is a vector of length 1024. The key-value pairs map continuous temporal features to specific entries. The mapping method is to calculate the cosine distance between each entry and the temporal feature, and the largest value is the mapping result. The entries are words in the vocabulary of the large language model guided by projected text encoding.
[0023] The quantization dictionary contains 4096 entries, each of which is a key-value pair. The value vectors in the quantization dictionary are obtained through self-supervised learning and are not entirely constructed from projected text encoding. When training the value vectors in the quantization dictionary through self-supervised learning, the negative log-likelihood function between the value vectors in the quantization dictionary and the projected text encoding is used as the loss function.
[0024] In step S02, when performing self-supervised learning training on the value vector in the quantization dictionary, the mean absolute error is calculated between the input and output of the time-domain information.
[0025] After training, the time-domain information is encoded to obtain the trained quantization dictionary.
[0026] A temporal information parsing and processing device based on a large language model is characterized by comprising a large language model vocabulary module, a language encoder, a text projector, a large language model module, a temporal encoder, a quantization dictionary, and a temporal decoder.
[0027] The large language model vocabulary module is responsible for constructing a large language model vocabulary by randomly combining uncommon texts, providing a source of vocabulary when converting time-domain information into language information.
[0028] The language encoder uses a pre-trained Transformer network, which is responsible for encoding combinations of text words randomly selected from the vocabulary of a large language model to obtain text encoding.
[0029] The text projector is composed of a multilayer perceptron (MLP network) and is used to map the language modality to the temporal information modality to obtain the projected text encoding used to map temporal features;
[0030] Both the time-domain encoder and the time-domain decoder employ Transformer networks.
[0031] The time-domain encoder is responsible for dimensionality reduction and encoding of time-domain information to obtain time-domain features;
[0032] The quantization dictionary is used to receive the temporal features output by the temporal encoder, map them to similar projected text codes, and perform self-supervised learning to discretize the temporal information.
[0033] The time-domain decoder is used to receive the quantized time-domain information and decode it into the original time-domain information.
[0034] The large language model module is used to parse and process temporal information using the context learning capabilities of the large language model; after receiving a prompt word, the large language model combines the quantization dictionary to determine whether the prompt word contains temporal information:
[0035] If time-domain information is included, the quantized time-domain information is received through a time-domain decoder, decoded to the original time-domain information, and the processing result is output.
[0036] If no time-domain information is included, the processing result will be output directly.
[0037] It also includes a time-domain information processing module, which is responsible for dividing the time-domain information into custom intervals before encoding. The division unit is determined by the user based on the time-domain information measurement unit and the specific task. After division, the data is normalized and then input into the time-domain encoder.
[0038] The quantization dictionary consists of key-value pairs, where the key is a word number and the value is a vector of length 1024. The key-value pairs map continuous temporal features to specific entries. The mapping method is to calculate the cosine distance between each entry and the temporal feature, and the largest value is the mapping result. The entries are words in the vocabulary of the large language model guided by projected text encoding.
[0039] A temporal information parsing and processing device based on a large language model is characterized by comprising a memory and a processor; the memory is used to store a computer program, and the processor is used to execute the computer program to implement the method steps described above.
[0040] A readable storage medium, characterized in that: a computer program is stored on the readable storage medium, and when the computer program is executed by a processor, it implements the method steps described above.
[0041] The beneficial effects of this invention are: the time-domain information parsing and processing method based on a large language model can process and predict time-domain data without modifying the parameters of the large model, solving the problem that traditional multimodal large language models require a large amount of data for training, reducing learning costs and improving production efficiency.
Attached Figure Description
[0042] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0043] Figure 1 is a schematic diagram of the self-supervised learning method of the time-domain information autoencoder of the present invention.
[0044] Figure 2 is a schematic diagram of the temporal information parsing and processing flow based on a large language model according to the present invention.
Detailed Implementation
[0045] To enable those skilled in the art to better understand the technical solutions of this invention, the technical solutions in the embodiments of this invention will be clearly and completely described below in conjunction with the embodiments of this invention. Obviously, the described embodiments are merely some embodiments of this invention, and not all embodiments. Based on the embodiments of this invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this invention.
[0046] This time-domain information parsing and processing method based on a large language model includes the following steps:
[0047] Step S01: Construct a large language model vocabulary by randomly combining uncommon texts, such as rare characters and characters from various countries; randomly sample and combine the text words in the large language model vocabulary, input them into the language encoder to obtain the text encoding, and input the text encoding into the text projector; after the text projector further transforms and processes the text features, it outputs the projected text encoding used to map temporal features.
[0048] Step S02: Establish a time-domain information autoencoder, including a time-domain encoder, a quantization dictionary, and a time-domain decoder;
[0049] Temporal features are obtained by dimensionality reduction and encoding of temporal information through a temporal encoder.
[0050] The quantization dictionary receives the temporal features output by the temporal encoder, maps them to similar projected text codes, and performs self-supervised learning to discretize the temporal information.
[0051] Step S03: Input the prompt words into the large language model. The large language model, combined with the quantization dictionary, determines whether the prompt words contain temporal information.
[0052] If time-domain information is included, the quantized time-domain information is received through a time-domain decoder, decoded to the original time-domain information, and the processing result is output.
[0053] If no time-domain information is included, the processing result will be output directly.
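The branch in step S03 can be illustrated with a minimal sketch. The text does not specify how the model "determines" whether temporal information is present, so the approach here is an assumption: scan the text for phrases that appear in the quantization dictionary and, if any are found, hand their keys to the decoder. The names `find_quantized_phrases`, `handle_output`, `phrase_to_key`, and the stand-in `decode` callable are hypothetical.

```python
def find_quantized_phrases(text, phrase_to_key):
    """Greedy longest-match scan for dictionary phrases inside the text."""
    lengths = sorted({len(p) for p in phrase_to_key}, reverse=True)
    keys, i = [], 0
    while i < len(text):
        for n in lengths:
            chunk = text[i:i + n]
            # Require a full-length chunk so a truncated slice at the end
            # of the text cannot match a shorter phrase by accident.
            if len(chunk) == n and chunk in phrase_to_key:
                keys.append(phrase_to_key[chunk])
                i += n
                break
        else:
            i += 1  # no phrase starts here; move on one character
    return keys


def handle_output(text, phrase_to_key, decode):
    """Decode if the text carries quantized time-domain information,
    otherwise return the text unchanged (assumed reading of step S03)."""
    keys = find_quantized_phrases(text, phrase_to_key)
    if keys:
        return decode(keys)
    return text


# Hypothetical two-entry phrase table and a stand-in for the decoder.
phrase_to_key = {"龘": 0, "龘靐": 1}
fake_decode = lambda keys: [float(k) for k in keys]

with_signal = handle_output("Output: 龘靐龘", phrase_to_key, fake_decode)
without_signal = handle_output("Category A", phrase_to_key, fake_decode)
```

In a real system the stand-in decoder would be replaced by the trained time-domain decoder, which maps the looked-up value vectors back to a signal.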
[0054] In step S01, the language encoder uses a pre-trained Transformer network, and the text projector is composed of a multilayer perceptron (MLP network) to map the language modality to the temporal information modality.
[0055] In the vocabulary of the large language model, uncommon texts are randomly selected and combined using one, two, or three tokens to form longer phrases that express temporal signals. Although these texts cannot be interpreted by humans, they can be understood by the model using the contextual capabilities of the large language model.
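As a rough illustration of the vocabulary construction, the sketch below randomly combines one, two, or three rare tokens into unique phrases. The token pool `RARE_TOKENS`, the function name `build_vocabulary`, and the vocabulary size are assumptions made for the example; a real system would draw tokens from the tokenizer of the target large language model.

```python
import random

# Hypothetical pool of rare tokens, chosen only for illustration.
RARE_TOKENS = ["龘", "靐", "ꙮ", "ѭ", "ღ", "ஔ", "ᚠ", "߷"]


def build_vocabulary(tokens, size, max_len=3, seed=0):
    """Randomly combine 1..max_len tokens into `size` distinct phrases."""
    capacity = sum(len(tokens) ** n for n in range(1, max_len + 1))
    if size > capacity:
        raise ValueError("not enough distinct combinations for requested size")
    rng = random.Random(seed)
    combos = set()
    while len(combos) < size:
        length = rng.randint(1, max_len)
        # Store tuples so distinctness is judged on the token sequence.
        combos.add(tuple(rng.choice(tokens) for _ in range(length)))
    return sorted("".join(c) for c in combos)


vocab = build_vocabulary(RARE_TOKENS, size=64)
```

With 8 tokens and phrases of up to three tokens there are 8 + 64 + 512 = 584 possible combinations, so a larger token pool would be needed to fill a dictionary with thousands of entries.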
[0056] In step S02, both the time-domain encoder and the time-domain decoder use Transformer networks;
[0057] Before encoding, the time-domain information is first segmented according to a user-defined interval. The segmentation unit is determined by the user based on the measurement unit of the time-domain information and the specific task. For example, if predicting hourly weather time series information, the segmentation unit is hours. After segmentation, the data is normalized and then input into the time-domain encoder.
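The segmentation and normalization step might look like the following sketch. The text does not say which normalization is used, so z-score normalization per segment is an assumption, as are the function names and the sample readings.

```python
from statistics import mean, pstdev


def segment(series, samples_per_segment):
    """Split a series into consecutive fixed-length segments,
    dropping any incomplete segment at the end."""
    n = len(series) // samples_per_segment
    return [
        series[i * samples_per_segment:(i + 1) * samples_per_segment]
        for i in range(n)
    ]


def normalize(seg, eps=1e-8):
    """Z-score normalization (assumed; the source only says 'normalized')."""
    mu = mean(seg)
    sigma = pstdev(seg)
    return [(x - mu) / (sigma + eps) for x in seg]


# Hypothetical temperature readings taken every 15 minutes,
# so one hourly segment holds 4 samples.
readings = [20.0, 20.5, 21.0, 21.5, 22.0, 22.0, 21.0, 20.0, 19.5]
hourly = [normalize(s) for s in segment(readings, samples_per_segment=4)]
```

Each normalized segment would then be passed to the time-domain encoder.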
[0058] The time-domain encoder performs dimensionality reduction and encoding of time-domain information;
[0059] The quantization dictionary consists of key-value pairs, where the key is a word number and the value is a vector of length 1024. The key-value pairs map continuous temporal features to specific entries. The mapping method is to calculate the cosine distance between each entry and the temporal feature, and the largest value is the mapping result. The entries are words in the vocabulary of the large language model guided by projected text encoding.
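The lookup described above can be sketched as follows. The text says "cosine distance" and selects the largest value; this sketch assumes cosine similarity is meant, since the closest entry is the one with the largest similarity. The toy dictionary uses vectors of length 4 rather than 1024, and all names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def quantize(feature, dictionary):
    """Return the key whose value vector is most similar to the feature."""
    return max(dictionary, key=lambda k: cosine_similarity(feature, dictionary[k]))


# Toy quantization dictionary: key is a word number, value is a vector.
toy_dictionary = {
    0: [1.0, 0.0, 0.0, 0.0],
    1: [0.0, 1.0, 0.0, 0.0],
    2: [0.5, 0.5, 0.5, 0.5],
}
key = quantize([0.1, 0.9, 0.0, 0.1], toy_dictionary)
```

Because cosine similarity ignores vector magnitude, two features that differ only in scale map to the same entry, which is one reason to normalize segments before encoding.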
[0060] The quantization dictionary contains 4096 entries, each of which is a key-value pair. The value vectors in the quantization dictionary are obtained through self-supervised learning and are not entirely constructed from projected text encoding. When training the value vectors in the quantization dictionary through self-supervised learning, the negative log-likelihood function between the value vectors in the quantization dictionary and the projected text encoding is used as the loss function.
[0061] In step S02, when performing self-supervised learning training on the value vector in the quantization dictionary, the mean absolute error is calculated between the input and output of the time-domain information.
[0062] After training, the time-domain information is encoded to obtain the trained quantization dictionary.
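The two training signals named above can be sketched in a few lines. The mean absolute error follows its standard definition. The exact form of the negative log-likelihood between value vectors and projected text encodings is not given in the text, so the version here (a softmax over dot products, scored against a target index) is an assumption, and the text does not state how the two terms are weighted or combined.

```python
import math


def mean_absolute_error(original, reconstructed):
    """Reconstruction loss between autoencoder input and output."""
    return sum(abs(x - y) for x, y in zip(original, reconstructed)) / len(original)


def alignment_nll(value_vec, text_codes, target_index):
    """Assumed alignment loss: softmax over dot products between a value
    vector and every projected text code, then the negative
    log-likelihood of the intended code."""
    logits = [sum(v * t for v, t in zip(value_vec, code)) for code in text_codes]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target_index]


# Hypothetical toy values.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
codes = [[1.0, 0.0], [0.0, 1.0]]
nll_match = alignment_nll([5.0, 0.0], codes, target_index=0)
nll_mismatch = alignment_nll([5.0, 0.0], codes, target_index=1)
```

Under this assumed form, a value vector aligned with its target text code yields a small loss, and a misaligned one yields a large loss.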
[0063] This invention leverages the in-context learning capabilities of a large language model to process temporal information for specific tasks, extending prompt engineering methods. The prompt consists of three parts: a task description, examples (containing quantized temporal information), and a task objective. The large language model outputs specific text based on the examples and the task objective to achieve the task goal.
[0064] The following explanation uses a classification task as an example:
[0065] The prompt words will be input into the large language model;
[0066] Task Description: Given the following pairs of data, output whether they belong to category A or category B.
[0067] Example:
[0068] Input: (Quantized time-domain information 1), Output: Category A;
[0069] Input: (Quantized time-domain information 2), Output: Category B;
[0070] Task objective: Input: (Target quantized time-domain information), Output:
[0071] After receiving the above prompts, the large language model will output results for a specific category based on the example content and task objective, thus achieving the purpose of classification tasks. Other tasks are similar.
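The three-part prompt from the classification example can be assembled with a small helper. This is a minimal sketch: the function name `build_prompt` and the placeholder strings standing in for quantized time-domain information are hypothetical.

```python
def build_prompt(task_description, examples, target):
    """Assemble task description, examples, and task objective into one prompt."""
    lines = [f"Task Description: {task_description}", "Example:"]
    for quantized, label in examples:
        lines.append(f"Input: {quantized}, Output: {label};")
    # The prompt ends with an open "Output:" for the model to complete.
    lines.append(f"Task objective: Input: {target}, Output:")
    return "\n".join(lines)


prompt = build_prompt(
    "Given the following pairs of data, output whether they belong to "
    "category A or category B.",
    [("龘靐ꙮ", "Category A"), ("ѭღ", "Category B")],
    "龘ღ",
)
```

The resulting string is sent to the large language model unchanged; no model parameters are modified, and switching tasks only requires a different description and different examples.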
[0072] The temporal information parsing and processing device based on a large language model includes a large language model vocabulary module, a language encoder, a text projector, a large language model module, a temporal encoder, a quantization dictionary, and a temporal decoder.
[0073] The large language model vocabulary module is responsible for constructing a large language model vocabulary by randomly combining uncommon texts, providing a source of vocabulary when converting time-domain information into language information.
[0074] The language encoder uses a pre-trained Transformer network, which is responsible for encoding combinations of text words randomly selected from the vocabulary of a large language model to obtain text encoding.
[0075] The text projector is composed of a multilayer perceptron (MLP network) and is used to map the language modality to the temporal information modality to obtain the projected text encoding used to map temporal features;
[0076] Both the time-domain encoder and the time-domain decoder employ Transformer networks.
[0077] The time-domain encoder is responsible for dimensionality reduction and encoding of time-domain information to obtain time-domain features;
[0078] The quantization dictionary is used to receive the temporal features output by the temporal encoder, map them to similar projected text codes, and perform self-supervised learning to discretize the temporal information.
[0079] The time-domain decoder is used to receive the quantized time-domain information and decode it into the original time-domain information.
[0080] The large language model module is used to parse and process temporal information using the context learning capabilities of the large language model; after receiving a prompt word, the large language model combines the quantization dictionary to determine whether the prompt word contains temporal information:
[0081] If time-domain information is included, the quantized time-domain information is received through a time-domain decoder, decoded to the original time-domain information, and the processing result is output.
[0082] If no time-domain information is included, the processing result will be output directly.
[0083] It also includes a time-domain information processing module, which is responsible for dividing the time-domain information into custom intervals before encoding. The division unit is determined by the user based on the time-domain information measurement unit and the specific task. After division, the data is normalized and then input into the time-domain encoder.
[0084] The quantization dictionary consists of key-value pairs, where the key is a word number and the value is a vector of length 1024. The key-value pairs map continuous temporal features to specific entries. The mapping method is to calculate the cosine distance between each entry and the temporal feature, and the largest value is the mapping result. The entries are words in the vocabulary of the large language model guided by projected text encoding.
[0085] The time-domain information parsing and processing device based on a large language model includes a memory and a processor; the memory is used to store computer programs, and the processor is used to execute the computer programs to implement the method steps described above.
[0086] The readable storage medium stores a computer program that, when executed by a processor, implements the method steps described above.
[0087] Compared with existing technologies, this temporal information parsing and processing method based on a large language model has the following characteristics:
[0088] First, it leverages the contextual learning capabilities of large language models to directly understand and process time-domain data without needing to fine-tune the model parameters.
[0089] Second, through self-supervised learning strategies and word embedding techniques of language models, temporal information can be effectively mapped into language tokens, enabling large language models to directly parse and predict.
[0090] Third, a dictionary for discrete representation of time-domain information was constructed, allowing any time-domain information to be represented as an element in the dictionary; combined with the quantization encoding dictionary of the autoencoder, it is not only possible to discretize time-domain information, but also to decode it back to the normal state through the decoder.
[0091] Fourth, it provides a method for achieving multimodal information association without the need for extensive data training, which is applicable to fields such as time series signal prediction, IoT signal processing, and medical and health monitoring.
[0092] Fifth, by applying deep learning libraries, it is possible to achieve interactive language understanding and processing of temporal information without modifying the parameters of large models.
[0093] The embodiments described above are merely one specific implementation of the present invention. Ordinary changes and substitutions made by those skilled in the art within the scope of the technical solution of the present invention should be included within the protection scope of the present invention.
Claims
1. A method for time domain information analysis and processing based on a large language model, characterized in that it includes the following steps:
Step S01: Construct a large language model vocabulary by randomly combining uncommon texts; randomly sample and combine the text words in the large language model vocabulary, input them into the language encoder to obtain the text encoding, and input it into the text projector; after the text projector further transforms and processes the text features, it obtains the projected text encoding used to map the temporal features.
Step S02: Establish a time-domain information autoencoder, including a time-domain encoder, a quantization dictionary, and a time-domain decoder; temporal features are obtained by dimensionality reduction and encoding of temporal information through a temporal encoder. The quantization dictionary receives the temporal features output by the temporal encoder, maps them to similar projected text codes, and performs self-supervised learning to discretize the temporal information. The quantization dictionary consists of key-value pairs, where the key is a word number and the value is a vector of length 1024. The key-value pairs map continuous temporal features to specific entries. The mapping method is to calculate the cosine distance between each entry and the temporal feature, and the largest value is the mapping result. The entries are words in the vocabulary of the large language model guided by projected text encoding.
Step S03: Input the prompt words into the large language model. The large language model, combined with the quantization dictionary, determines whether the prompt words contain temporal information. If time-domain information is included, the quantized time-domain information is received through a time-domain decoder, decoded to the original time-domain information, and the processing result is output. If no time-domain information is included, the processing result will be output directly.
2. The method according to claim 1, characterized in that: in step S01, the language encoder uses a pre-trained Transformer network, and the text projector is composed of a multilayer perceptron and is used to map the language modality to the temporal information modality. In the vocabulary of the large language model, uncommon texts are randomly selected and combined using one, two, or three tokens to form longer phrases for expressing time-domain signals.
3. The method according to claim 1, characterized in that: in step S02, both the time-domain encoder and the time-domain decoder use Transformer networks. Before encoding, the time-domain information is first divided into custom intervals. The division unit is determined by the user based on the measurement unit of the time-domain information and the specific task. After division, the data is normalized and then input into the time-domain encoder. The time-domain encoder performs dimensionality reduction and encoding on time-domain information.
4. The method of claim 3, wherein the method further comprises: The quantization dictionary contains 4096 entries, each of which is a key-value pair. The value vectors in the quantization dictionary are obtained through self-supervised learning and are not entirely constructed from projected text encoding. When training the value vectors in the quantization dictionary through self-supervised learning, the negative log-likelihood function between the value vectors in the quantization dictionary and the projected text encoding is used as the loss function.
5. The method of claim 4, wherein the method further comprises: In step S02, when performing self-supervised learning training on the value vector in the quantization dictionary, the mean absolute error is calculated between the input and output of the time-domain information. After training, the time-domain information is encoded to obtain the trained quantization dictionary.
6. A large language model-based time domain information parsing and processing apparatus, characterized in that: it includes a large language model vocabulary module, a language encoder, a text projector, a large language model module, a temporal encoder, a quantization dictionary, and a temporal decoder. The large language model vocabulary module is responsible for constructing a large language model vocabulary by randomly combining uncommon texts, providing a source of vocabulary when converting time-domain information into language information. The language encoder uses a pre-trained Transformer network, which is responsible for encoding combinations of text words randomly selected from the vocabulary of a large language model to obtain text encoding. The text projector is composed of a multilayer perceptron and is used to map language modalities to temporal information modalities to obtain projected text encodings for mapping temporal features. Both the time-domain encoder and the time-domain decoder employ Transformer networks. The time-domain encoder is responsible for dimensionality reduction and encoding of time-domain information to obtain time-domain features. The quantization dictionary is used to receive the temporal features output by the temporal encoder, map them to similar projected text codes, and perform self-supervised learning to discretize the temporal information. The quantization dictionary consists of key-value pairs, where the key is a word number and the value is a vector of length 1024. The key-value pairs map continuous temporal features to specific entries. The mapping method is to calculate the cosine distance between each entry and the temporal feature, and the largest value is the mapping result. The entries are words in the vocabulary of the large language model guided by projected text encoding. The time-domain decoder is used to receive the quantized time-domain information and decode it into the original time-domain information. The large language model module is used to parse and process temporal information using the context learning capabilities of the large language model; after receiving a prompt word, the large language model combines the quantization dictionary to determine whether the prompt word contains temporal information: if time-domain information is included, the quantized time-domain information is received through a time-domain decoder, decoded to the original time-domain information, and the processing result is output; if no time-domain information is included, the processing result will be output directly.
7. The large language model-based time-domain information parsing and processing apparatus according to claim 6, characterized in that: It also includes a time-domain information processing module, which is responsible for dividing the time-domain information into custom intervals before encoding. The division unit is determined by the user based on the time-domain information measurement unit and the specific task. After division, the data is normalized and then input into the time-domain encoder.
8. A large language model-based time domain information parsing and processing device, characterized in that: It includes a memory and a processor; the memory is used to store a computer program, and the processor is used to execute the computer program to implement the method as described in any one of claims 1 to 5.
9. A readable storage medium characterized by: The readable storage medium stores a computer program that, when executed by a processor, implements the method as described in any one of claims 1 to 5.
Citation Information
Patent Citations
- Adaptive bandwidth extension and apparatus for the same (CN105637583A)
- Improving classification between time-domain coding and frequency domain coding (CN106663441A)