A method for VHF voice association of ship name suitable for vessel traffic management system

By employing dynamic ship name set preprocessing, multi-model collaborative transcription, and hot word optimization, combined with a multi-size sliding window matching algorithm, the problem of low ship name recognition efficiency in VHF communication was solved, achieving ship name association with high accuracy and low resource consumption, thus meeting the real-time and compatibility requirements of the VTS system.

CN120299455B · Active · Publication Date: 2026-06-19 · CSIC PRIDE (NANJING) ATMOSPHERIC & OCEANIC INFORMATION SYST CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CSIC PRIDE (NANJING) ATMOSPHERIC & OCEANIC INFORMATION SYST CO LTD
Filing Date
2025-03-21
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing VHF communication in vessel traffic management systems suffers from problems such as information omission, risk of misjudgment, inability to deeply integrate with VTS systems, and low efficiency in identifying ship names. In particular, it is difficult to accurately extract ship names in VHF call environments with noise interference, high technicality, and difficulty in utilizing context.

Method used

By employing dynamic ship name set preprocessing, multi-model collaborative transcription, and hot word optimization, combined with a multi-size sliding window matching algorithm, and narrowing the matching range through a pinyin code table and an approximate sound index table, highly accurate ship name association is achieved.

Benefits of technology

It significantly improved the accuracy of ship name recognition, reduced the impact of noise interference, reduced the consumption of computing resources, met real-time requirements, reduced system transformation costs, and improved the automation and intelligence level of the VTS system.

Abstract

This invention discloses a method for associating ship names with VHF voice signals in a vessel traffic management system. The method includes: Step 1, establishing a pinyin code table and an approximate sound index table; Step 2, dynamic ship name set preprocessing; Step 3, multi-channel speech-to-text conversion; Step 4, multi-size sliding window matching; and Step 5, outputting the best-matching ship name. This invention, combined with the characteristics of a vessel traffic management system, solves the problems of low accuracy and difficulty in ship name association in VHF speech-to-text conversion through dynamic ship name set narrowing of the matching range, multi-model collaborative transcription and hot word optimization, dynamic ship name set preprocessing, and multi-size sliding window matching algorithms. It significantly improves the intelligence level of maritime supervision and features low resource consumption, high real-time performance, and high accuracy.

Description

Technical Field

[0001] This invention relates to the field of ship traffic management technology, and in particular to a method for associating ship names with VHF voice signals in a ship traffic management system.

Background Technology

[0002] Vessel Traffic Management (VTS) systems are primarily responsible for monitoring vessel movement within waterways and ensuring navigational safety. Their monitored areas are characterized by highly active economies, dense cargo throughput, numerous traffic participants, and significant risks of collisions and pollution; therefore, VTS systems require extremely high reliability. As a crucial component of VTS, VHF (Very High Frequency) radio communication is, according to relevant requirements, the primary means of real-time communication between vessels and between vessels and shore-based control centers, its importance being self-evident. However, despite the significant role of VHF communication in vessel traffic management, it also has the following drawbacks and limitations:

[0003] 1. VHF dialogues are voice content, which is easily affected by dialects, attention, noise, etc., leading to information omissions or misunderstandings. They may also be affected by the experience of the speakers, fatigue, or the pressure of emergency situations, posing a risk of inappropriate communication or misjudgment.

[0004] 2. VHF communication transmits voice content, which cannot be deeply integrated with the VTS information system, limiting its application in automated and intelligent maritime management and failing to effectively reduce the watchkeeping workload of watchkeepers. For example, it cannot simultaneously identify and display the corresponding vessel information in the VTS system based on the voice call content during a VHF call, still requiring watchkeepers to manually query vessel information, resulting in low efficiency.

[0005] To address the aforementioned drawbacks and limitations, accurate ship name text needs to be extracted from VHF speech. However, existing speech-to-text methods are inefficient for ship name recognition and association due to the following issues:

[0006] 1. Low accuracy of ship name speech-to-text transcription: VHF communication environment is subject to noise interference, and ship names often contain nautical-specific numerical pronunciations (such as "0" being pronounced as "dong"), making it difficult for general speech recognition models to accurately transcribe them.

[0007] 2. Wide range of ship name matching: Traditional methods require matching in the entire Chinese character database, which has high computational complexity and poor real-time performance.

[0008] 3. Interference from similar sounds: Crew members may confuse the pronunciation of ship names during VHF calls (such as "bei" and "bai"), leading to an increased mismatch rate.

[0009] 4. Difficulty in utilizing context: Current deep learning-based speech recognition models improve transcription accuracy by leveraging common text statistical patterns (content familiarity) and contextual relationships. However, because public VHF channel resources are scarce, VHF calls are usually brief, highly specialized conversations, and a VHF conversation between traffic control and a given vessel typically ends within five sentences. Therefore, it is difficult to accurately extract ship name information using contextual information.

[0010] 5. Highly specialized: VHF calls contain many frequently used specialized terms, such as "Hello, traffic management" and "reporting to you." If these terms are not effectively utilized, it increases the difficulty of extracting ship names. Ship names themselves are not common everyday expressions, making it difficult for general speech models to accurately transcribe them.

[0011] Currently, there is no method for associating ship names with VHF voice that is optimized for VHF communication scenarios. This invention addresses this problem by proposing a solution that integrates dynamic ship name set preprocessing, multi-model collaborative transcription, and intelligent matching, thus filling a technological gap in this field.

Summary of the Invention

[0012] The technical problem to be solved by the present invention is to address the shortcomings of the prior art by providing a method for VHF voice association of ship names in a ship traffic management system. This method achieves high accuracy in ship name association by using dynamic ship name set to narrow the matching range, multi-model collaborative transcription and hot word optimization, dynamic ship name set preprocessing and multi-size sliding window matching algorithm.

[0013] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is as follows:

[0014] A method for associating ship names with VHF voice signals in a ship traffic management system includes the following steps.

[0015] Step 1: Establish a Pinyin code table and a similar sound index table.

[0016] Step 2, Dynamic vessel name set preprocessing: Receive and preprocess the dynamic vessel name set provided by the VTS system in real time.

[0017] Step 3, Multi-channel speech-to-text: Receive VHF call voice in real time, process it into text using multiple speech recognition models, and generate multiple output pinyin sequences of two types: real-time transcription and accurate transcription. Query the pinyin code table established in Step 1 and convert them into pinyin code sequences respectively.

[0018] Step 4: Multi-size sliding window matching: For each type of speech recognition model, different sized sliding windows are constructed, and each pinyin code sequence in each sliding window is traversed. The retrieval operation is performed through the approximate sound index table and the preprocessed dynamic ship name set to output multiple types of ship name sound matching sets. Each type of ship name sound matching set includes several ship name pinyin code entries.

[0019] Step 5: Output the best matching ship name: Take the ship name pinyin code entries from the various types of ship name sound matching sets output in Step 4, and output the best matching ship name pinyin code entries according to the weighted confidence.

[0020] Step 1, the method for establishing the Pinyin code table and the approximate sound index table, includes the following steps:

[0021] Step 1-1: Create a Pinyin code table: Assign numerical key values to all common Pinyin syllables starting from 0, forming a one-to-one Pinyin code table.

[0022] Step 1-2: Establish an index table of similar sounds: Based on the characteristics of VHF voice call services, group easily confused similar sounds to form an index table of similar sounds.

[0023] Step 2, the preprocessing method for the dynamic ship name set, includes the following steps:

[0024] Step 2-1: Construct a dynamic set of vessel names: The VTS system constructs a dynamic set of vessel names based on the coverage area of the VHF system, which includes the names of vessels that may be able to conduct VHF calls.

[0025] Step 2-2: Construct a dynamic set of ship name pinyin entries: Traverse the dynamic ship name set and convert each ship name text into a ship name pinyin entry, thus forming a dynamic set of ship name pinyin entries; when the ship name contains characters with nautical numbers, multiple ship name pinyin entries are generated.

[0026] Step 2-3: Construct a three-dimensional index array: Traverse all ship name pinyin entries in the dynamic ship name pinyin entry set, convert each ship name pinyin entry into a ship name pinyin code entry using the pinyin code table, and store it in a three-dimensional index array; where the first dimension of the three-dimensional index array is the pinyin code corresponding to a single pinyin; the second dimension is the length of the ship name pinyin; and the third dimension is the position of the pinyin in the ship name.

[0027] Step 3, the multi-channel speech-to-text method, includes the following steps:

[0028] Step 3-1: Deploy multiple speech recognition models;

[0029] Step 3-2: Configure a nautical-specific hot word library and a nautical-specific pronunciation mapping for numbers for each speech recognition model;

[0030] Step 3-3: Output two types of Pinyin sequences in real time: real-time transcription and accurate transcription for each speech recognition model.

[0031] Step 3-4: Query the Pinyin code table established in Step 1, and convert the Pinyin sequence output in real time in Step 3-3 into a Pinyin code sequence.

[0032] In step 4, each speech recognition model constructs a sliding window of different sizes, and each sliding window is assigned an independent thread.

[0033] In step 4, the various types of ship name sound matching sets are divided into four types of ship name sound matching sets: homophone complete matching set, homophone partial matching set, similar sound complete matching set, and similar sound partial matching set.

[0034] In step 4, the preprocessed dynamic ship name set is a three-dimensional index array. The method for matching the sounds of various types of ship names includes the following steps:

[0035] Step 4-1: Each slide distance is 1 pinyin, and after each slide, iterate through each pinyin code within the sliding window;

[0036] Step 4-2: For each pinyin code, use the pinyin code, the sliding window size, and the position of the pinyin code in the sliding window as parameters to query the three-dimensional index array and obtain the set of homophonic ship name pinyin code entries for the corresponding pinyin code.

[0037] Step 4-3: For each pinyin code, find its approximate sound through the approximate sound index table and form an approximate sound code set; if an approximate sound exists, traverse each approximate sound code in the approximate sound code set, take the approximate sound code, the sliding window size, and the position of the approximate sound code in the sliding window as parameters, query the three-dimensional index array, and add the query result to the approximate sound associated ship name pinyin code entry set of the corresponding pinyin code.

[0038] Step 4-4: Perform an intersection operation on the set of homophone-related ship name pinyin code entries and the set of similar-sound-related ship name pinyin code entries for all pinyin codes in the sliding window, and output the set of homophone complete matches, the set of homophone partial matches, the set of similar-sound complete matches, and the set of similar-sound partial matches.

[0039] In step 3, the number of speech recognition models is set to N. The four types of ship name sound matching sets output by each of the N speech recognition models contain a total of M ship name pinyin code entries. Therefore, in step 5, the method for selecting the best matching ship name from the M ship name pinyin code entries includes the following steps:

[0040] Step 5-1, Model Weight Assignment: Assign a weight m_i to the i-th speech recognition model, where 1 ≤ i ≤ N.

[0041] Step 5-2: Calculate type weights: Calculate the weight w_k of the k-th type of ship name sound matching value. The specific calculation formula is as follows:

[0042]

[0043] In the formula, T represents the number of pinyin codes contained in a pinyin code sequence A within a certain sliding window.

[0044] H represents the number of homophones that match the pinyin code sequence A, as output by the speech recognition model.

[0045] A represents the number of approximate phonetic matches that the speech recognition model outputs that match the pinyin code sequence A.

[0046] λ is the approximate sound penalty factor, with a value ranging from 0 to 1.0.

[0047] Step 5-3: Determine the indicator function: Let δ_{i,k} be the indicator function for the i-th speech recognition model outputting the k-th type of ship name sound matching value. The specific expression is:

[0048]

[0049] Step 5-4: Calculate the weighted confidence: Calculate the weighted confidence C for each ship name pinyin code entry. The specific calculation formula is as follows:

[0050]

[0051] Step 5-5, Threshold Determination: Compare the weighted confidence C of each ship name pinyin code entry with the set confidence threshold.

[0052] Step 5-6: Output the best matching ship name pinyin code entry: Among the ship name pinyin code entries whose weighted confidence C exceeds the confidence threshold, select the one with the highest weighted confidence C as the best matching ship name pinyin code entry and output it.

[0053] In step 5-5, the confidence threshold is a dynamic threshold that can be adaptively adjusted according to the ship density and communication noise level in the VTS-covered waters; when the ship density is high or the communication noise level is high, the confidence threshold will adaptively decrease.

[0054] The approximate sound penalty factor λ can be dynamically adjusted according to the VHF communication noise level; when VHF communication is noisy, the value of the approximate sound penalty factor λ is increased.
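The formulas for w_k, δ_{i,k}, and C are not reproduced in this text, so the following Python sketch is only one plausible reading of Step 5, not the patented formulas. It assumes, purely for illustration, that a type weight takes the form (H + λ·A) / T and that the weighted confidence of an entry is the sum of m_i times that weight over the models that output the entry (the indicator δ_{i,k} is implicit in which hits are listed). The function names and the sample numbers are hypothetical.

```python
def type_weight(H, A, T, lam=0.5):
    # ASSUMED form (not from the source): homophone hits count fully,
    # approximate-sound hits are scaled by the penalty factor lam in [0, 1].
    return (H + lam * A) / T


def weighted_confidence(hits, model_weights, lam=0.5):
    # hits: list of (model_id, H, A, T) tuples, one per model that output
    # this ship-name entry; models that did not output it contribute nothing.
    return sum(model_weights[i] * type_weight(H, A, T, lam) for i, H, A, T in hits)


def best_match(candidates, model_weights, threshold, lam=0.5):
    # Score every candidate, keep those above the threshold, return the best.
    scored = {
        entry: weighted_confidence(hits, model_weights, lam)
        for entry, hits in candidates.items()
    }
    passing = {entry: c for entry, c in scored.items() if c > threshold}
    return max(passing, key=passing.get) if passing else None


# Hypothetical weights and hits for two models and two candidate entries.
MODEL_WEIGHTS = {1: 0.5, 2: 0.5}
CANDIDATES = {
    (0, 1, 0, 9): [(1, 4, 0, 4), (2, 3, 1, 4)],  # exact hit by model 1, approximate by model 2
    (0, 1, 2, 9): [(1, 3, 0, 4)],                # partial hit by model 1 only
}
best = best_match(CANDIDATES, MODEL_WEIGHTS, threshold=0.6)
```

Under these assumed forms, raising λ in noisy conditions gives approximate-sound hits more credit, which is consistent with the adjustment direction described in [0054].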

[0055] The present invention has the following beneficial effects:

[0056] 1. Improved recognition accuracy: This invention significantly reduces dialect and noise interference by using a dynamic ship name set to narrow the matching range, multi-model collaboration, and hot word configuration. The accuracy of pinyin transcription is improved by more than 30% compared with the direct use of general speech recognition models, achieving an association accuracy of >95% and a false association rate of <1.5%.

[0057] 2. Low resource consumption: The dynamic ship name set is preprocessed into a three-dimensional index array, which reduces the cost of sliding window matching operations, so that the pinyin code sequences output by more than 10 models can be processed simultaneously on a standard VTS multi-source server.

[0058] 3. Real-time Response: The multi-threaded sliding window matching and lock-free queue design of this invention greatly optimizes the utilization of computing resources. On a single standard-configured VTS multi-source server, it can simultaneously process more than 10 model output pinyin code sequences with a latency controlled within 500ms, meeting the high real-time requirements of VHF voice association with ship names. In addition, this solution is easy to upgrade to a multi-threaded or distributed cluster architecture, further enhancing the system's ability to synchronously process VHF voice from multiple agents in real time, fully meeting the needs of future business expansion.

[0059] 4. The above multi-size sliding window matching algorithm: Construct sliding windows of different sizes based on different ship name lengths, calculate homophone and similar sound matching values in parallel, and dynamically output the best matching result through weighted confidence.

[0060] 5. Dynamic adaptability: The confidence threshold and approximate sound penalty factor can be adaptively adjusted according to environmental noise, enhancing the robustness of the system.

[0061] 6. Strong compatibility and significant cost-effectiveness: This invention has minimal impact on the existing VTS technology architecture, demonstrating excellent compatibility. It can be easily integrated without large-scale modifications, achieving automatic association between VHF voice and ship information, reducing watchkeeping workload. This design approach balances the construction and maintenance costs of the VTS system, maximizing cost-effectiveness and providing users with a low-cost and highly usable solution, greatly enhancing the feasibility and attractiveness of market promotion.

Attached Figure Description

[0062] Figure 1 is a flowchart of a method for associating ship names with VHF voice signals, applicable to a ship traffic management system.

[0063] Figure 2 is a schematic diagram of the three-dimensional index array structure in this invention.

[0064] Figure 3 is a schematic diagram of sliding windows of different sizes in this invention.

[0065] Figure 4 is a schematic diagram of the sliding window matching operation logic in this invention.

Detailed Implementation

[0066] The present invention will now be described in further detail with reference to the accompanying drawings and specific preferred embodiments.

[0067] In this embodiment, the vessels "Dongfang 03" and "Tianhekou No. 8" in a certain VTS maritime jurisdiction are taken as an example, together with the VHF dialogue voice "Hello traffic control, Dongfang Dong San is exiting Tianhekou and entering the main channel, reporting to you", for a detailed explanation.

[0068] As shown in Figure 1, a method for associating ship names with VHF voice in a ship traffic management system includes the following steps.

[0069] Step 1: Establish a Pinyin code table and a similar sound index table. The preferred method for establishing this table includes the following steps.

[0070] Step 1-1: Create a Pinyin code table

[0071] A. Assign numeric key values to all common pinyin syllables starting from 0, for example:

[0072] dong→0, fang→1, ling→2, yi→3, tian→4, he→5, kou→6, ba→7, hao→8, san→9,

[0073] …, tong→16, ...

[0074] B. Form a one-to-one mapping table and store it as key-value pairs to support fast querying.

[0075] Step 1-2: Establish an index table of approximate sounds

[0076] A. Based on easily confused pronunciations in VHF communication, define approximate sound groups:

[0077]

[0078]

[0079] B. Establish an index table of approximate sounds with the main pinyin code as the index value. For example, when you enter pinyin code 0 (the pinyin code for dong), you can query the associated approximate sound code 16 (the pinyin code for tong).
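As a rough illustration of Step 1, the two tables can be held as plain dictionaries. The Python sketch below uses only the codes listed in the example above; the symmetric tong→dong entry and the helper names `encode` and `approx_codes` are assumptions for illustration, and a real table would cover all common pinyin syllables.

```python
# Pinyin code table: codes taken from the example in Step 1-1.
PINYIN_CODES = {
    "dong": 0, "fang": 1, "ling": 2, "yi": 3, "tian": 4,
    "he": 5, "kou": 6, "ba": 7, "hao": 8, "san": 9, "tong": 16,
}

# Approximate sound index table: main pinyin code -> easily confused codes.
# Only dong -> tong comes from the text; the reverse entry is assumed.
APPROX_SOUNDS = {
    0: {16},   # dong -> tong
    16: {0},   # tong -> dong (assumed symmetric)
}


def encode(pinyin_seq):
    """Convert a list of pinyin syllables into a list of pinyin codes."""
    return [PINYIN_CODES[p] for p in pinyin_seq]


def approx_codes(code):
    """Return the approximate sound codes for a pinyin code (empty set if none)."""
    return APPROX_SOUNDS.get(code, set())


codes = encode("dong fang dong san".split())   # [0, 1, 0, 9]
near = approx_codes(0)                          # {16}
```

Keying both tables by integer code keeps later lookups to dictionary reads, which matches the key-value storage described in Step 1-1 B.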

[0080] Step 2, Dynamic vessel name set preprocessing: Receive and preprocess the dynamic vessel name set provided by the VTS system in real time, preferably including the following steps.

[0081] Step 2-1: Construct a dynamic ship name set. Based on the coverage range of VHF calls, the VTS system collects the names of ships that may conduct VHF calls into a dynamic ship name set (for example, including only ships in the current water area).

[0082] In this embodiment, the VTS system screens the ships in the currently covered water area, such as "Dongfang 03" and "Tianhekou No. 8", to generate a dynamic ship name set.

[0083] Step 2-2: Construct a dynamic ship name pinyin entry set. Traverse the dynamic ship name set and convert each ship name text into a ship name pinyin entry one by one to form a dynamic ship name pinyin entry set. When a ship name contains digits that have nautical-specific readings, multiple ship name pinyin entries are generated.

[0084] In this embodiment, the preferred method for converting ship name texts is as follows:

[0085] (1) Processing of "Dongfang 03":

[0086] Converted into two pinyin entries (one per reading of the digit 0, including the nautical reading):

[0087] dong fang dong san ("Dong San" reading)

[0088] dong fang ling san ("Ling San" reading)

[0089] (2) Processing of "Tianhekou No. 8":

[0090] Converted into a standard pinyin entry: tian he kou ba hao
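The expansion in Step 2-2 can be sketched as a Cartesian product over the possible readings of each token. In the Python sketch below, the two readings of "0" come from the example above; the entries for "3" and "8", the token format, and the function name `expand_entries` are assumptions made for illustration.

```python
from itertools import product

# Readings per digit. Only the two readings of "0" are taken from the text;
# the single readings of "3" and "8" mirror the example entries.
DIGIT_READINGS = {
    "0": ["dong", "ling"],
    "3": ["san"],
    "8": ["ba"],
}


def expand_entries(tokens):
    """Expand a ship name, given as pinyin syllables and digit characters,
    into every pinyin entry implied by the digit readings."""
    options = [DIGIT_READINGS.get(t, [t]) for t in tokens]
    return [list(combo) for combo in product(*options)]


dongfang = expand_entries(["dong", "fang", "0", "3"])
tianhekou = expand_entries(["tian", "he", "kou", "8", "hao"])
```

Here `dongfang` holds the "dong san" and "ling san" entries, and `tianhekou` holds the single entry tian he kou ba hao.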

[0091] Step 2-3: Construct a three-dimensional index array. Traverse all the ship name pinyin entries in the dynamic ship name pinyin entry set, convert each ship name pinyin entry into a ship name pinyin code entry using the pinyin code table, and store it in the three-dimensional index array. Among them, the first dimension of the three-dimensional index array is the pinyin code corresponding to a single pinyin; the second dimension is the length of the ship name pinyin; the third dimension is the position of the pinyin in the ship name.

[0092] In this embodiment, the ship name pinyin entries are converted into ship name pinyin code entries as follows:

[0093] ■ dong fang dong san → [0,1,0,9]

[0094] ■ dong fang ling san → [0,1,2,9]

[0095] ■ tian he kou ba hao → [4,5,6,7,8]

[0096] Example 1: The pinyin code entry for the "dong san" reading of the ship name "Dongfang 03" is [0,1,0,9], with a length of 4:

[0097] Since dong (code 0) is in the first position, store the pinyin code entry [0,1,0,9] into Q[0][4][1].

[0098] Since fang (code 1) is in the 2nd position, store the pinyin code entry [0,1,0,9] into Q[1][4][2].

[0099] Since dong (code 0) is in the 3rd position, store the pinyin code entry [0,1,0,9] into Q[0][4][3].

[0100] Since san (code 9) is in the 4th position, store the pinyin code entry [0,1,0,9] into Q[9][4][4].

[0101] Example 2: The pinyin code entry for the ship name "Tianhekou No. 8" is [4,5,6,7,8], with a length of 5:

[0102] Since tian (code 4) is in the 1st position, store the pinyin code entry [4,5,6,7,8] into Q[4][5][1].

[0103] Since he (code 5) is in the 2nd position, store the pinyin code entry [4,5,6,7,8] into Q[5][5][2].

[0104] Since kou (code 6) is in the 3rd position, store the pinyin code entry [4,5,6,7,8] into Q[6][5][3].

[0105] Since ba (code 7) is in the 4th position, store the pinyin code entry [4,5,6,7,8] into Q[7][5][4].

[0106] Since hao (code 8) is in the 5th position, store the pinyin code entry [4,5,6,7,8] into Q[8][5][5].

[0107] In this embodiment, the constructed three-dimensional index array is shown in Figure 2. Taking the element [6, 5, 3] of the three-dimensional index array in Figure 2 as an example, this element is of set type. The first dimension has a value of 6, indicating that all the ship name pinyin code sequences stored in it contain the pinyin code 6. The second dimension has a value of 5, indicating that the stored ship name pinyin code sequences all have length 5. The third dimension has a value of 3, indicating that the third position of each stored ship name pinyin code sequence is the pinyin code 6. This storage method gives the sliding window algorithm a high-performance lookup. For example, if the pinyin code at position Z of a sliding window of size Y is X, then the ship name pinyin code entries associated with that pinyin code can be retrieved by directly reading the element value (a set) at [X, Y, Z] of the three-dimensional index array.
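A minimal Python sketch of the three-dimensional index, assuming a dictionary keyed by (pinyin code, name length, 1-based position) stands in for the array Q. The sparse dictionary and the name `build_index` are illustrative choices, not taken from the patent; the code entries are the three from this embodiment.

```python
from collections import defaultdict


def build_index(code_entries):
    """Map (pinyin code, name length, 1-based position) to the set of
    ship name pinyin code entries that have that code at that position."""
    index = defaultdict(set)
    for entry in map(tuple, code_entries):
        for pos, code in enumerate(entry, start=1):
            index[(code, len(entry), pos)].add(entry)
    return index


Q = build_index([
    [0, 1, 0, 9],      # dong fang dong san
    [0, 1, 2, 9],      # dong fang ling san
    [4, 5, 6, 7, 8],   # tian he kou ba hao
])

# Element [6, 5, 3]: names of length 5 whose 3rd pinyin is kou (code 6).
kou_hits = Q.get((6, 5, 3), set())
# Element [0, 4, 1]: names of length 4 that start with dong (code 0).
dong_hits = Q.get((0, 4, 1), set())
```

Reading with `Q.get(...)` rather than `Q[...]` avoids creating empty elements for keys that were never stored.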

[0108] Step 3, Multi-channel speech-to-text: Receive VHF call voice in real time, process it into text using multiple speech recognition models, and generate multiple output pinyin sequences in two types: real-time transcription (low latency) and accurate transcription (high accuracy). Query the pinyin code table established in Step 1 and convert them into pinyin code sequences respectively.

[0109] The above-mentioned multi-channel speech-to-text method includes the following steps.

[0110] Step 3-1: Deploy N speech recognition models, such as Wav2Vec, Whisper, DeepSpeech, SenseVoice, etc. In this embodiment, N=2, namely Whisper (hereinafter referred to as Model 1) and SenseVoice (hereinafter referred to as Model 2).

[0111] Step 3-2: Configure a maritime-specific hot word library (such as "Hello traffic management" and "Report to you") and a maritime-specific pronunciation mapping for numbers for each speech recognition model.

[0112] Step 3-3: Output two types of Pinyin sequences in real time: real-time transcription and accurate transcription, for each speech recognition model.

[0113] 1. Voice input: The VHF voice message is "Hello traffic management, Dongfang Dong San is exiting Tianhekou and entering the main channel. We are reporting to you."

[0114] 2. Multi-model transcription and hot word optimization:

[0115] (1) Model 1 (Whisper) output:

[0116] jiao guan li hao,dong fang dong san chu tian he kou,xiang ni bao gao

[0117] (2) Model 2 (SenseVoice) output:

[0118] jiao guang ni hao,dong fang tong san chu tian he kou,xiang ni bao gao

[0119] (3) Hot word replacement: Based on the hot word database, replace “jiao guan li hao” with “jiao guan ni hao”, and replace “jiao guang ni hao” with “jiao guan ni hao”.

[0120] After hot word replacement, the corrected sequences are as follows:

[0121] Model 1 final pinyin sequence:

[0122] jiao guan ni hao, dong fang dong san chu tian he kou, xiang ni bao gao

[0123] Model 2 final pinyin sequence:

[0124] jiao guan ni hao, dong fang tong san chu tian he kou, xiang ni bao gao
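The text does not say how a misrecognized phrase is mapped to a hot word. One hedged possibility, sketched below in Python, is to compare each comma-separated clause with every hot word at the syllable level and replace the clause when the similarity is high enough. The use of `difflib.SequenceMatcher`, the 0.7 threshold, and the name `apply_hot_words` are assumptions for illustration only.

```python
from difflib import SequenceMatcher

# Hot words from the embodiment, written as pinyin.
HOT_WORDS = ["jiao guan ni hao", "xiang ni bao gao"]


def apply_hot_words(transcript, threshold=0.7):
    """Replace each clause with the most similar hot word when the
    syllable-level similarity reaches the (assumed) threshold."""
    fixed = []
    for clause in transcript.split(","):
        syllables = clause.split()
        best, best_ratio = clause.strip(), 0.0
        for hot in HOT_WORDS:
            ratio = SequenceMatcher(None, syllables, hot.split()).ratio()
            if ratio >= threshold and ratio > best_ratio:
                best, best_ratio = hot, ratio
        fixed.append(best)
    return ", ".join(fixed)


model_1 = apply_hot_words(
    "jiao guan li hao,dong fang dong san chu tian he kou,xiang ni bao gao"
)
model_2 = apply_hot_words(
    "jiao guang ni hao,dong fang tong san chu tian he kou,xiang ni bao gao"
)
```

With these inputs, both greetings are normalized to jiao guan ni hao while the ship name clause is left untouched, so the dong/tong difference between the two models is preserved for the approximate sound matching in Step 4.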

[0125] Step 3-4: Query the Pinyin code table established in Step 1, and convert the Pinyin sequence output in real time in Step 3-3 into a Pinyin code sequence. In this embodiment, the Pinyin code sequences after conversion by the two models are as follows.

[0126] Model 1 Pinyin code sequence: [9,10,11,6,,0,1,0,9,14,4,5,6,,13,11,14,15]

[0127] Model 2 Pinyin code sequence: [9,10,11,6,,0,1,16,9,14,4,5,6,,13,11,14,15]

[0128] Step 4: Multi-size sliding window matching

[0129] A. Sliding window initialization

[0130] For each type of speech recognition model, different sized sliding windows are constructed. In this example, nine sliding windows with lengths ranging from 2 to 10 are constructed, and each sliding window is assigned an independent thread.
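The window initialization can be sketched in Python as one task per window size, each producing every window of that size at a slide distance of 1. The thread pool, the function names, and the sample code sequence are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def windows(codes, size):
    """All contiguous windows of the given size, sliding by one pinyin."""
    return [codes[i:i + size] for i in range(len(codes) - size + 1)]


def scan(codes, size):
    """Work done by one thread: enumerate the windows of one size."""
    return size, windows(codes, size)


# Sample pinyin code sequence (hypothetical, for illustration only).
sample = [0, 1, 0, 9, 14, 4, 5, 6]

# Nine window sizes, 2 through 10, each handled by its own worker thread.
with ThreadPoolExecutor(max_workers=9) as pool:
    by_size = dict(pool.map(lambda s: scan(sample, s), range(2, 11)))
```

In CPython the global interpreter lock limits CPU parallelism between threads, so a production system might use processes or a compiled language; the sketch only shows how the work divides by window size.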

[0131] B. Traverse each pinyin code sequence within each sliding window, perform retrieval operations using the approximate sound index table and the preprocessed dynamic ship name set, and output multiple types of ship name sound matching sets, preferably four types of ship name sound matching sets: homophone complete matching set, homophone partial matching set, approximate sound complete matching set, and approximate sound partial matching set.

[0132] Each type of ship name sound matching set includes several ship name pinyin code entries. Therefore, the four types of ship name sound matching sets output by each of the N speech recognition models contain a total of M ship name pinyin code entries.

[0133] As shown in Figure 3, the above-mentioned methods for matching the sounds of various types of ship names preferably include the following steps.

[0134] Step 4-1: Each slide distance is 1 pinyin, and after each slide, iterate through each pinyin code in the sliding window.

[0135] Step 4-2: For each pinyin code, use the pinyin code, the sliding window size, and the position of the pinyin code in the sliding window as parameters to query the three-dimensional index array and obtain the set of homophonic ship name pinyin code entries for the corresponding pinyin code.

[0136] Step 4-3: For each pinyin code, find its approximate sound through the approximate sound index table and form an approximate sound code set; if an approximate sound exists, traverse each approximate sound code in the approximate sound code set, take the approximate sound code, the sliding window size, and the position of the approximate sound code in the sliding window as parameters, query the three-dimensional index array, and add the query result to the approximate sound associated ship name pinyin code entry set of the corresponding pinyin code.

[0137] Step 4-4: Perform an intersection operation on the set of homophone-related ship name pinyin code entries and the set of similar-sound-related ship name pinyin code entries for all pinyin codes in the sliding window, and output the set of homophone complete matches, the set of homophone partial matches, the set of similar-sound complete matches, and the set of similar-sound partial matches.
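Steps 4-1 to 4-4 can be sketched in Python as follows, restricted to the two complete matching sets for brevity. The dictionary-based index, the symmetric dong/tong approximate sound pair, the function names, and the code 14 used for chu are assumptions for illustration.

```python
from collections import defaultdict


def build_index(entries):
    """(code, name length, 1-based position) -> set of ship name code entries."""
    index = defaultdict(set)
    for entry in map(tuple, entries):
        for pos, code in enumerate(entry, start=1):
            index[(code, len(entry), pos)].add(entry)
    return index


def match_window(window, index, approx):
    """Return (homophone complete set, approximate sound complete set)."""
    size = len(window)
    homo_sets, combined_sets = [], []
    for pos, code in enumerate(window, start=1):
        # Step 4-2: homophone lookup by (code, window size, position).
        homo = index.get((code, size, pos), set())
        # Step 4-3: the same lookup for every approximate sound code.
        near = set()
        for alt in approx.get(code, ()):
            near |= index.get((alt, size, pos), set())
        homo_sets.append(homo)
        combined_sets.append(homo | near)
    # Step 4-4: intersect across all positions in the window.
    homo_complete = set.intersection(*homo_sets)
    approx_complete = set.intersection(*combined_sets) - homo_complete
    return homo_complete, approx_complete


def scan(codes, size, index, approx):
    """Step 4-1: slide by one pinyin and collect the windows that match."""
    hits = []
    for start in range(len(codes) - size + 1):
        homo, near = match_window(codes[start:start + size], index, approx)
        if homo or near:
            hits.append((start, homo, near))
    return hits


INDEX = build_index([[0, 1, 0, 9], [0, 1, 2, 9], [4, 5, 6, 7, 8]])
APPROX = {0: {16}, 16: {0}}   # dong <-> tong (reverse direction assumed)

# dong fang tong san chu tian he kou, as heard by Model 2 (chu = 14 assumed).
model_2_hits = scan([0, 1, 16, 9, 14, 4, 5, 6], 4, INDEX, APPROX)
# dong fang dong san, as heard by Model 1.
model_1_match = match_window([0, 1, 0, 9], INDEX, APPROX)
```

For the Model 2 sequence, the only window of size 4 that matches is the first one, and it lands in the approximate sound complete set because tong had to be mapped back to dong.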

[0138] Figure 4 shows the sliding window matching operation logic. The specific matching operation logic is as follows.

[0139] Homophone matching: Directly retrieve the three-dimensional index array through the pinyin code and calculate the values of complete match (H=T) and partial match (H / T≥80%).

[0140] Approximate sound matching: Combine the approximate sound index table to expand the search range and calculate the approximate sound complete match (H+A=T) and partial match ((H+A) / T≥80%) values.

[0141] A. The homophone complete matching set includes several homophone complete match values.

[0142] Definition of homophone complete match value: the intersection of the sets of homophone-associated ship name pinyin code entries for all pinyin codes within the sliding window. In other words, every ship name pinyin code entry in this set meets the following condition: if the entry contains T pinyins and, when all pinyin codes within the sliding window are matched against it, the number of homophone-matching pinyin codes is H, then H = T.

[0143] B. The homophone partial matching set includes several homophone partial match values.

[0144] Definition of homophone partial match value: the set of ship name pinyin code entries for which the proportion of pinyin codes within the sliding window whose homophone-associated entry sets hit that same entry exceeds a preset threshold (such as 80%). In other words, every ship name pinyin code entry in this set meets the following condition: if the entry contains T pinyins and, when all pinyin codes within the sliding window are matched against it, the number of homophone-matching pinyin codes is H, then H < T and H / T is greater than the preset threshold (such as 80%).

[0145] C. The approximate homophonic exact match set includes several approximate homophonic exact match values

[0146] Definition of approximate homophonic exact match value: The difference set between the intersection of the set of associated ship names for all pinyin codes within the sliding window (the set of approximate homophonic associated ship name pinyin code entries + the set of homophonic associated ship name pinyin code entries) and the homophonic exact match value is the approximate homophonic exact match value. In other words, all ship name pinyin code entries in this set meet the following conditions: If a ship name pinyin code entry contains T pinyins, and when all pinyin codes within the sliding window match this ship name pinyin code entry, the number of homophonic matching pinyin codes is H, and the number of approximate homophonic matching pinyin codes is A (A ≥ 1), then H + A = T.

[0147] D. The approximate homophonic partial match set includes several approximate homophonic partial match values

[0148] Definition of approximate homophonic partial match value: The ratio of the set of associated ship name pinyin code entries for pinyin codes within the sliding window (the set of approximate homophonic associated ship name pinyin code entries + the set of homophonic associated ship name pinyin code entries) hitting the same ship name pinyin code entry exceeds a preset threshold (such as 80%), and the difference set between the set of relevant ship name pinyin code entries and the homophonic partial match value is the approximate homophonic partial match value. In other words, all ship name pinyin code entries in this set meet the following conditions: If a ship name pinyin code entry contains T pinyins, and when all pinyin codes within the sliding window match this ship name pinyin code entry, the number of homophonic matching pinyin codes is H, and the number of approximate homophonic matching pinyin codes is A (A ≥ 1), then (H + A) < T and (H + A) / T is greater than the preset threshold (such as 80%).
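The four definitions above can be condensed into a small classification routine. The following is a minimal sketch, not the patented implementation: the function name `match_types` and the label strings are hypothetical, and the threshold comparison uses "greater than or equal to 80%" as written in the summary of the matching logic (the detailed definitions say "exceeds"). Because the approximate complete match is only differenced against the homophone complete match, one entry may carry two labels at once.

```python
from typing import Set


def match_types(h: int, a: int, t: int, threshold: float = 0.8) -> Set[str]:
    """Return the match-set labels for one ship name entry.

    h: number of homophone-matched pinyin codes (H)
    a: number of approximate-sound-matched pinyin codes (A)
    t: number of pinyin codes in the ship name entry (T)
    """
    if t <= 0 or h < 0 or a < 0 or h + a > t:
        raise ValueError("need T > 0, H >= 0, A >= 0 and H + A <= T")

    types: Set[str] = set()

    # A and B: homophone complete (H = T) or partial (H < T, H / T >= threshold).
    if h == t:
        types.add("homophone_complete")
    elif h / t >= threshold:
        types.add("homophone_partial")

    # C and D: require at least one approximate-sound hit (A >= 1).
    if a >= 1:
        if h + a == t:
            types.add("approx_complete")
        elif (h + a) / t >= threshold and "homophone_partial" not in types:
            # D is the difference set against the homophone partial matches.
            types.add("approx_partial")
    return types
```

For example, `match_types(3, 1, 4)` yields only the approximate complete label, since H / T = 75% falls below the threshold while H + A = T.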

[0149] In this embodiment, the partial matching process between Model 1 and Model 2 is illustrated below.

[0150] (1) Code sequence for Model 1:

[0151] (1.1) Sliding window length 4 (thread 1):

[0152] On the 6th slide, the window covers the code segment [0,1,0,9], producing a complete homophone match.

[0153] Query the three-dimensional index array Q:

[0154] Position 1: Pinyin code [0] → Q[0][4][1] → Contains Pinyin code entry [0,1,0,9]

[0155] Position 2: Pinyin code [1] → Q[1][4][2] → Contains Pinyin code entry [0,1,0,9]

[0156] Position 3: Pinyin code [0] → Q[0][4][3] → Contains Pinyin code entry [0,1,0,9]

[0157] Position 4: Pinyin code [9] → Q[9][4][4] → Contains Pinyin code entry [0,1,0,9]

[0158] The pinyin code entry [0,1,0,9] was completely matched.
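The four lookups above amount to intersecting the entry sets stored at Q[code][length][position]. The sketch below illustrates this under stated assumptions: the array Q is modeled as a dictionary keyed by (pinyin code, entry length, 1-based position), and the names `build_index` and `complete_homophone_matches` are hypothetical. The source does not specify the concrete storage layout.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Set, Tuple

Entry = Tuple[int, ...]
Index = Dict[Tuple[int, int, int], Set[Entry]]


def build_index(entries: Iterable[Sequence[int]]) -> Index:
    """Store each ship name entry under (code, entry length, position)."""
    q: Index = defaultdict(set)
    for entry in entries:
        for pos, code in enumerate(entry, start=1):  # positions are 1-based
            q[(code, len(entry), pos)].add(tuple(entry))
    return q


def complete_homophone_matches(q: Index, window: Sequence[int]) -> Set[Entry]:
    """Intersect the per-position hit sets; survivors satisfy H = T."""
    t = len(window)
    result = None
    for pos, code in enumerate(window, start=1):
        hits = q.get((code, t, pos), set())
        result = set(hits) if result is None else result & hits
    return result or set()


q = build_index([(0, 1, 0, 9), (7, 8, 9, 10, 11)])
matches = complete_homophone_matches(q, [0, 1, 0, 9])
```

Keying by length and position means that a window of size 4 never touches entries of length 5, which is how the index narrows the search range before any intersection is computed.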

[0159] (1.2) Sliding window length 5 (thread 2):

[0160] On the 11th slide, the window covers the code segment [4,5,6,7,,13].

[0161] The query matches 3 of the 5 pinyin codes in [4,5,6,7,8] (H=3, T=5); H / T = 60% is below the threshold (assumed to be 80%), so no homophone match is output.

[0162] (2) Code sequence for Model 2:

[0163] (2.1) Sliding window length 4 (thread 3):

[0164] On the 6th slide, the window covers the code segment [0,1,16,9], producing an approximate sound complete match.

[0165] Using the approximate sound index table, tong (code 16) is associated with dong (code 0); the code segment [0,1,16,9] with a sliding window length of 4 is converted into the approximate sound code [0,1,0,9]; the three-dimensional index array Q is queried, H=3, A=1, T=4, that is, H+A=T, which is a perfect match with the approximate sound [0,1,0,9].
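The approximate sound expansion in this example can be sketched as a per-position count of H and A for every candidate entry. This is an illustrative sketch only: the symmetric table `{16: {0}, 0: {16}}` is an assumption (the text only states that code 16 is associated with code 0), and `build_index` and `count_hits` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Set, Tuple

Entry = Tuple[int, ...]
Index = Dict[Tuple[int, int, int], Set[Entry]]


def build_index(entries: Iterable[Sequence[int]]) -> Index:
    """Store each entry under (code, entry length, 1-based position)."""
    q: Index = defaultdict(set)
    for entry in entries:
        for pos, code in enumerate(entry, start=1):
            q[(code, len(entry), pos)].add(tuple(entry))
    return q


def count_hits(q: Index, approx: Dict[int, Set[int]],
               window: Sequence[int]) -> Dict[Entry, Tuple[int, int]]:
    """Return {entry: (H, A)} for one sliding window."""
    t = len(window)
    counts: Dict[Entry, Tuple[int, int]] = {}
    for pos, code in enumerate(window, start=1):
        homophone = q.get((code, t, pos), set())
        similar: Set[Entry] = set()
        for alt in approx.get(code, ()):  # expand via the approximate sound table
            similar |= q.get((alt, t, pos), set())
        similar -= homophone  # a homophone hit is never also counted as approximate
        for entry in homophone:
            h, a = counts.get(entry, (0, 0))
            counts[entry] = (h + 1, a)
        for entry in similar:
            h, a = counts.get(entry, (0, 0))
            counts[entry] = (h, a + 1)
    return counts


q = build_index([(0, 1, 0, 9)])
approx = {16: {0}, 0: {16}}  # assumed symmetric tong/dong grouping
counts = count_hits(q, approx, [0, 1, 16, 9])
```

Under these assumptions the window [0,1,16,9] yields H=3 and A=1 for the entry [0,1,0,9], which reproduces the H + A = T result of this paragraph.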

[0166] (2.2) Sliding window length 5 (thread 4):

[0167] In the Figure 4 example of sliding window matching with a length of 5, it is assumed that the ship dynamic selectable set contains the following 4 ship name pinyin code entries: [7, 8, 9, 10, 11], [7, 7, 9, 10, 11], [7, 18, 9, 10, 11], [7, 18, 9, 1, 11], and the partial matching threshold is set to 80%.

[0168] On the 11th slide, the window covers the code segment [4,5,6,7,,13], and homophone matching is performed.

[0169] If [4,5,6,7,8] matches 3 pinyin codes (H=3, T=5), then H / T = 60% is lower than the threshold (assumed to be 80%), and nothing is output.
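The window enumeration used in the worked examples above (stride of 1 pinyin, one thread per window size) can be sketched as follows. This is a hedged illustration for a single model's code sequence; the function names and the use of `ThreadPoolExecutor` are assumptions, since the source only states that each sliding window is assigned an independent thread.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence


def windows(seq: Sequence[int], size: int) -> List[Sequence[int]]:
    """All windows of the given size, sliding by 1 pinyin code each time."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def scan_window_size(seq: Sequence[int], size: int,
                     match_fn: Callable[[Sequence[int]], object]) -> List[object]:
    """Apply the matcher to every window of one size (runs in one thread)."""
    return [match_fn(w) for w in windows(seq, size)]


def scan_all_sizes(seq: Sequence[int], sizes: Sequence[int],
                   match_fn: Callable[[Sequence[int]], object]) -> Dict[int, List[object]]:
    """Run one worker per window size and collect the results by size."""
    with ThreadPoolExecutor(max_workers=len(sizes)) as pool:
        futures = {s: pool.submit(scan_window_size, seq, s, match_fn) for s in sizes}
        return {s: f.result() for s, f in futures.items()}


# Placeholder matcher: sum() stands in for the real index lookup.
demo = scan_all_sizes([1, 2, 3], [2, 3], sum)
```

In a full system, `match_fn` would be replaced by the index query, and the same scan would be repeated for each speech recognition model's code sequence.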

[0170] Step 5: Output the best matching ship name: Take the ship name pinyin code entries from the various types of ship name sound matching sets output in Step 4, and output the best matching ship name pinyin code entries according to the weighted confidence.

[0171] In this embodiment, the above four threads output homophonic perfect matches {[0,1,0,9]} and near-homophonic perfect matches {[0,1,0,9]} during the sliding process. The following uses the ship name code entry [0,1,0,9] as an example to calculate the credibility.

[0172] The above-mentioned method for outputting the best matching ship name preferably includes the following steps.

[0173] Step 5-1, Model Weight Assignment: Assign a weight $m_i$ to the i-th speech recognition model, where 1 ≤ i ≤ N.

[0174] In this embodiment, Model 1 has higher accuracy with a weight m1 = 0.7, while Model 2 has a weight m2 = 0.3.

[0175] Step 5-2: Calculate type weights: Calculate the weight $w_k$ of the k-th type ship name sound matching value. The specific calculation formula is as follows:

[0176] $$w_k = \frac{H + \lambda A}{T}$$

[0177] In the formula, T represents the number of pinyin codes contained in a pinyin code sequence A within a certain sliding window.

[0178] H represents the number of homophones that match the pinyin code sequence A, as output by the speech recognition model.

[0179] A represents the number of approximate phonetic matches that the speech recognition model outputs that match the pinyin code sequence A.

[0180] λ is the approximate tone penalty factor, which takes a value of 0 to 1.0 and can be dynamically adjusted for VHF communication noise. When VHF communication is noisy, the value of the approximate tone penalty factor λ is increased.

[0181] In this embodiment,

[0182] Homophone matching (H=4, T=4): w1=(4+0) / 4=1.0

[0183] Approximate tone perfect match (H=3, A=1, T=4): w3=(3+0.8×1) / 4=0.95 (assuming λ=0.8)
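The two worked values above are consistent with a type weight of the form (H + λ·A) / T. A minimal sketch that reproduces them, assuming that form and the example value λ = 0.8 (the function name `type_weight` is hypothetical):

```python
def type_weight(h: int, a: int, t: int, lam: float = 0.8) -> float:
    """Type weight w_k = (H + lambda * A) / T, with lambda in [0, 1]."""
    if t <= 0:
        raise ValueError("T must be positive")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    return (h + lam * a) / t


w1 = type_weight(4, 0, 4)        # homophone complete match
w3 = type_weight(3, 1, 4, 0.8)   # approximate sound complete match
```

With λ below 1, each approximate-sound hit contributes less than a homophone hit, so an approximate complete match always scores below a homophone complete match of the same length.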

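[0184] Step 5-3: Determine the indicator function: Let $\delta_{i,k}$ be the indicator function for the i-th speech recognition model outputting the k-th type of ship name sound matching value. The specific expression is:

[0185] $$\delta_{i,k} = \begin{cases} 1, & \text{if the } i\text{-th model outputs the } k\text{-th type of matching value} \\ 0, & \text{otherwise} \end{cases}$$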

[0186] Step 5-4: Calculate the weighted credibility: Calculate the weighted credibility C for each ship name pinyin code entry. The specific calculation formula is as follows:

[0187] $$C = \sum_{i=1}^{N} \sum_{k=1}^{4} m_i \, w_k \, \delta_{i,k}$$

[0188] In this embodiment, the weighted confidence level C of the ship name code entry [0,1,0,9] is:

[0189] C=0.7×1.0+0.3×0.95=0.985
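The value above is consistent with summing the product of model weight, type weight, and indicator over all models and match types. The following sketch reproduces the arithmetic under that assumption; the dictionary layout and the name `weighted_confidence` are illustrative, not taken from the source.

```python
from typing import Dict, Tuple


def weighted_confidence(m: Dict[int, float], w: Dict[int, float],
                        delta: Dict[Tuple[int, int], int]) -> float:
    """C = sum over models i and types k of m_i * w_k * delta_{i,k}.

    m: model weights keyed by model index i
    w: type weights keyed by match type index k
    delta: 1 where model i output a type-k match for this entry; missing keys mean 0
    """
    return sum(m[i] * w[k] * delta.get((i, k), 0) for i in m for k in w)


m = {1: 0.7, 2: 0.3}              # Model 1 and Model 2 weights from the embodiment
w = {1: 1.0, 3: 0.95}             # w1 (homophone complete), w3 (approximate complete)
delta = {(1, 1): 1, (2, 3): 1}    # Model 1 gave type 1, Model 2 gave type 3
c = weighted_confidence(m, w, delta)
```

Here only two indicator terms are non-zero, so the double sum reduces to 0.7×1.0 + 0.3×0.95.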

[0190] Step 5-5, Threshold Determination: Compare the weighted confidence C of each ship name pinyin code entry with the set confidence threshold.

[0191] The aforementioned confidence threshold is a dynamic threshold that can be adaptively adjusted according to the ship density and communication noise level in the VTS-covered waters. When the ship density or communication noise level is high, the confidence threshold will decrease adaptively.

[0192] In this embodiment, the current ship density is low, the confidence threshold is set to 0.9, and the confidence of the matching result [0,1,0,9] is 0.985, which exceeds the threshold.

[0193] Step 5-6: Output the best matching ship name pinyin code entry: Select the ship name pinyin code entry that exceeds the dynamic confidence threshold and has the highest weighted confidence C as the best matching ship name pinyin code entry and output it.

[0194] In this embodiment, the [0,1,0,9] with the highest credibility is selected, and its corresponding ship name "Dongfang 01" is sent to the VTS system to automatically identify the ship.
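The threshold test and final selection described in Steps 5-5 and 5-6 can be sketched as below. This is a minimal illustration: "exceeds" is read as a strict comparison, the second candidate entry and its confidence are made-up demonstration data, and the name table mapping [0,1,0,9] to "Dongfang 01" mirrors this embodiment only.

```python
from typing import Dict, Optional, Tuple

Entry = Tuple[int, ...]


def best_match(confidences: Dict[Entry, float], threshold: float) -> Optional[Entry]:
    """Return the entry with the highest C among those strictly above the threshold."""
    candidates = {e: c for e, c in confidences.items() if c > threshold}
    if not candidates:
        return None  # nothing is confident enough to report to the VTS system
    return max(candidates, key=candidates.get)


# Hypothetical inputs: the second entry and its score are illustrative only.
confidences = {(0, 1, 0, 9): 0.985, (7, 8, 9, 10, 11): 0.6}
names = {(0, 1, 0, 9): "Dongfang 01"}

winner = best_match(confidences, threshold=0.9)
ship_name = names.get(winner) if winner is not None else None
```

Returning `None` when no entry clears the threshold keeps a low-confidence guess from being sent to the VTS system as an identification.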

[0195] Effect verification

[0196] Accuracy: Model 1 perfectly matches "Dongfang 01", and Model 2 successfully matches after correction by approximation. Overall accuracy > 95%.

[0197] Real-time performance: Multi-threaded parallel processing with sliding window, matching latency <300ms (<500ms), meeting the real-time monitoring requirements of VTS.

[0198] Anti-interference: The approximate sound index table effectively corrects the misidentification of "tong" and "dong" and reduces the impact of noise interference.

[0199] This invention has been piloted and verified by a maritime safety administration's VTS system. Under a high-load scenario with a ship density of 5,000 ships per day, the accuracy of ship name association has been improved to 96.1%, and the false alarm rate has been reduced to below 1.4%, significantly improving the efficiency of maritime supervision.

[0200] The preferred embodiments of the present invention have been described in detail above. However, the present invention is not limited to the specific details in the above embodiments. Within the scope of the technical concept of the present invention, various equivalent transformations can be made to the technical solutions of the present invention, and these equivalent transformations all fall within the protection scope of the present invention.

Claims

1. A method for VHF voice association of ship names suitable for a vessel traffic management system, characterized in that it includes the following steps: Step 1: Establish a Pinyin code table and an approximate sound index table; Step 2, Dynamic Vessel Name Set Preprocessing: Receive and preprocess the dynamic vessel name set provided by the VTS system in real time; Step 3, Multi-channel speech-to-text: Receive VHF call voice in real time, process it into text using multiple speech recognition models, and generate multiple output pinyin sequences of two types: real-time transcription and accurate transcription; query the pinyin code table established in Step 1 and convert them into pinyin code sequences respectively; Step 4, Multi-size sliding window matching: For each type of Pinyin code sequence converted by the speech recognition model, sliding windows of different sizes are constructed, and each Pinyin code sequence within each sliding window is traversed; retrieval operations are performed using the approximate sound index table and the preprocessed dynamic ship name set to output multiple types of ship name sound matching sets; each type of ship name sound matching set includes several ship name Pinyin code entries; these multiple types of ship name sound matching sets are of four types: homophone complete matching set, homophone partial matching set, approximate sound complete matching set, and approximate sound partial matching set; Step 5, Output the best matching ship name: Take the ship name pinyin code entries from the various types of ship name sound matching sets output in Step 4, and output the best matching ship name pinyin code entries according to the weighted confidence; where the weighted confidence is related to the model weights, type weights and indicator functions.

2. The method for VHF voice association of ship names suitable for a vessel traffic management system according to claim 1, characterized in that: in Step 1, the method for establishing the Pinyin code table and the approximate sound index table includes the following steps: Step 1-1: Create a Pinyin code table: Assign numerical key values to all common Pinyin characters, starting from 0, to form a one-to-one Pinyin code table; Step 1-2: Establish an approximate sound index table: Based on the characteristics of VHF voice call services, group easily confused approximate sounds to form an approximate sound index table.

3. The method for VHF voice association of ship names in a ship traffic management system according to claim 1, characterized in that: in Step 2, the preprocessing method for the dynamic ship name set includes the following steps: Step 2-1: Construct a dynamic set of vessel names: The VTS system constructs a dynamic set of vessel names based on the coverage area of the VHF system, which includes the names of vessels that may conduct VHF calls; Step 2-2: Construct a dynamic set of ship name pinyin entries: Traverse the dynamic ship name set and convert each ship name text into a ship name pinyin entry, thus forming a dynamic set of ship name pinyin entries; when the ship name contains characters with nautical numbers, multiple ship name pinyin entries are generated; Step 2-3: Construct a three-dimensional index array: Traverse all ship name pinyin entries in the dynamic ship name pinyin entry set, convert each ship name pinyin entry into a ship name pinyin code entry using the pinyin code table, and store it in a three-dimensional index array; where the first dimension of the three-dimensional index array is the pinyin code corresponding to a single pinyin; the second dimension is the length of the ship name pinyin; and the third dimension is the position of the pinyin in the ship name.

4. The method for VHF voice association of ship names in a ship traffic management system according to claim 1, characterized in that: in Step 3, the multi-channel speech-to-text method includes the following steps: Step 3-1: Deploy multiple speech recognition models; Step 3-2: Configure a nautical-specific hot word library and a nautical-specific pronunciation mapping for numbers for each speech recognition model; Step 3-3: Output two types of Pinyin sequences in real time for each speech recognition model: real-time transcription and accurate transcription; Step 3-4: Query the Pinyin code table established in Step 1, and convert the Pinyin sequence output in real time in Step 3-3 into a Pinyin code sequence.

5. The method for VHF voice association of ship names in a ship traffic management system according to claim 1, characterized in that: In step 4, each speech recognition model constructs a sliding window of different sizes, and each sliding window is assigned an independent thread.

6. The method for VHF voice-associated ship names in a ship traffic management system according to claim 1, characterized in that: In step 4, the preprocessed dynamic ship name set is a three-dimensional index array. The method for matching the sounds of various types of ship names includes the following steps: Step 4-1: Each slide distance is 1 pinyin, and after each slide, iterate through each pinyin code within the sliding window; Step 4-2: For each pinyin code, use the pinyin code, the sliding window size, and the position of the pinyin code in the sliding window as parameters to query the three-dimensional index array and obtain the set of homophonic ship name pinyin code entries for the corresponding pinyin code; Step 4-3: For each pinyin code, find its similar sound through the similar sound index table and form a set of similar sound codes; if a similar sound exists, traverse each similar sound code in the set of similar sound codes, take the similar sound code, the sliding window size and the position of the similar sound code in the sliding window as parameters, query the three-dimensional index array, and add the query result to the set of similar sound associated ship name pinyin code entries of the corresponding pinyin code. Step 4-4: Perform an intersection operation on the set of homophone-related ship name pinyin code entries and the set of similar-sound-related ship name pinyin code entries for all pinyin codes in the sliding window, and output the set of homophone complete matches, the set of homophone partial matches, the set of similar-sound complete matches, and the set of similar-sound partial matches.

7. The method for VHF voice association of ship names in a ship traffic management system according to claim 6, characterized in that: In step 3, the number of speech recognition models is set to N. The four types of ship name sound matching sets output by the N speech recognition models contain a total of M ship name pinyin code entries. Therefore, in step 5, the method for selecting the best matching ship name from the M ship name pinyin code entries includes the following steps: Step 5-1, Model Weight Allocation: Assign a weight $m_i$ to the i-th speech recognition model, where 1 ≤ i ≤ N; Step 5-2, Calculate type weights: Calculate the weight $w_k$ of the k-th type ship name sound matching value. The specific calculation formula is as follows: $w_k = (H + \lambda A) / T$; In the formula, T is the number of pinyin codes contained in a certain pinyin code sequence A within a certain sliding window; H is the number of homophone matches with the pinyin code sequence A, as output by the speech recognition model; A is the number of approximate sound matches with the pinyin code sequence A, as output by the speech recognition model; λ is the approximate sound penalty factor, with a value ranging from 0 to 1.0; Step 5-3: Determine the indicator function: Let $\delta_{i,k}$ be the indicator function for the i-th speech recognition model outputting the k-th type of ship name sound matching value. The specific expression is: $\delta_{i,k} = 1$ if the i-th speech recognition model outputs the k-th type of ship name sound matching value, and $\delta_{i,k} = 0$ otherwise; Step 5-4: Calculate the weighted credibility: Calculate the weighted credibility C for each ship name pinyin code entry. The specific calculation formula is as follows: $C = \sum_{i=1}^{N} \sum_{k=1}^{4} m_i \, w_k \, \delta_{i,k}$; Step 5-5, Threshold Determination: Compare the weighted confidence C of each ship name pinyin code entry with the set confidence threshold for determination; Step 5-6: Output the best matching ship name pinyin code entry: Select the ship name pinyin code entry that exceeds the dynamic confidence threshold and has the highest weighted confidence C as the best matching ship name pinyin code entry and output it.

8. The method for VHF voice-associated ship names in a ship traffic management system according to claim 7, characterized in that: In step 5-5, the confidence threshold is a dynamic threshold that can be adaptively adjusted according to the ship density and communication noise level in the VTS-covered waters; when the ship density is high or the communication noise level is high, the confidence threshold is adaptively reduced.

9. The method for VHF voice association of ship names in a ship traffic management system according to claim 7, characterized in that: the approximate sound penalty factor λ can be dynamically adjusted for VHF communication noise; when VHF communication is in a high-noise state, the value of the approximate sound penalty factor λ is increased.