Road entity data deduplication method and device, computing equipment and medium
A road entity data deduplication technology, applied in the field of data processing, that addresses the inability to effectively identify duplicate road information and improves both the effectiveness and the processing efficiency of deduplication.
Examples
Embodiment 1
[0030] Figure 1 is a flow chart of the road entity data deduplication method provided by Embodiment 1 of the present invention. This embodiment is applicable to deduplicating data about the same road entity in road source data obtained from the Internet, such as various network media data related to roads. It is especially suited to cases where a full-text analysis is required to determine whether pieces of road source data describe the same road entity event. Road entity data can be understood as the effective data with processing value in the road source data. The method can be executed by a road entity data deduplication device, which can be implemented in software and/or hardware and can be integrated into any computing device, including but not limited to a server.
[0031] As shown in Figure 1, the road entity data deduplication method provided in this embodiment may include:
[0032] S110. Obtain at least one piece of road source data,...
Embodiment 2
[0044] Figure 2 is a flow chart of the road entity data deduplication method provided by Embodiment 2 of the present invention. This embodiment further optimizes and expands on the basis of the foregoing embodiments. As shown in Figure 2, the method may include:
[0045] S210. Obtain at least one piece of road source data, and classify it into at least one data subset according to road entity event type, wherein each data subset corresponds to one road entity event type, and the road source data is used to describe a road entity event.
[0046] S220. Use the first words, obtained by segmenting the text content corresponding to each piece of road source data in each data subset, as road candidate words.
[0047] The first words in this embodiment are a certain number of words that describe road names. For example, in the process of parsing the text content corresponding to each piece of road source data, the forward maximum matc...
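Steps S210 and S220 can be illustrated with a minimal sketch. Paragraph [0047] is truncated in this listing, so the sketch assumes it refers to the standard forward maximum matching segmentation technique; it is not the patent's actual implementation. The record fields (`event_type`, `text`), the dictionaries, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical segmentation dictionary and road-name list (assumptions, not from the patent).
DICTIONARY = {"中山", "中山路", "发生", "交通事故"}
ROAD_NAMES = {"中山路"}


def group_by_event_type(records):
    """Group road source records into subsets keyed by event type (cf. S210)."""
    subsets = defaultdict(list)
    for record in records:
        subsets[record["event_type"]].append(record)
    return dict(subsets)


def forward_max_match(text, dictionary, max_len=None):
    """Segment text left to right, always taking the longest dictionary word."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink it until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def road_candidates(text, dictionary, road_names):
    """Keep only segmented words that describe road names (cf. S220)."""
    return [w for w in forward_max_match(text, dictionary) if w in road_names]
```

In this sketch, the dictionary lookup prefers "中山路" over the shorter "中山" because the longest window is tried first; a production system would presumably use a much larger road gazetteer than the two small sets assumed here.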
Embodiment 3
[0062] Figure 3 is a flow chart of the road entity data deduplication method provided by Embodiment 3 of the present invention. This embodiment further optimizes and expands on the basis of the foregoing embodiments. As shown in Figure 3, the method may include:
[0063] S310. Obtain at least one piece of road source data, and classify it into at least one data subset according to road entity event type, wherein each data subset corresponds to one road entity event type, and the road source data is used to describe a road entity event.
[0064] S320. Determine the road name and the geographical area name in the text content corresponding to each piece of road source data in each data subset.
[0065] S330. For at least one road name and at least two geographical area names determined from the text content corresponding to each piece of road source data in each data subset, according to the affiliation relationship between the road and the ...