Road entity data deduplication method and device, computing equipment and medium
A technology relating to entity data and roads, applied in the field of data processing. It addresses the problem that duplicate road information cannot be identified effectively, and it improves both the effectiveness of deduplication and processing efficiency.
Example Embodiment
[0029] Embodiment 1
[0030] Figure 1 is a flowchart of a road entity data deduplication method provided in the first embodiment of the present invention. This embodiment is applicable to removing duplicates from road source data obtained from the Internet, such as road-related data from various network media, and especially to cases where full-text analysis is required to determine whether road source data describes a repeated road entity event. Road entity data can be understood as the effective data in the road source data that has processing value. The method can be executed by a road entity data deduplication device, which can be implemented in software and/or hardware and can be integrated in any computing device, including but not limited to a server.
[0031] As shown in Figure 1, the road entity data deduplication method provided in this embodiment may include:
[0032] S110. Obtain at least one road source data, and classify ...
Example Embodiment
[0043] Embodiment 2
[0044] Figure 2 is a flowchart of a road entity data deduplication method provided in the second embodiment of the present invention. This embodiment further optimizes and extends the above embodiment. As shown in Figure 2, the method can include:
[0045] S210. Obtain at least one piece of road source data, and classify it into at least one data subset according to road entity event type, where each data subset corresponds to one road entity event type and the road source data describes road entity events.
[0046] S220. Use the first words obtained by word segmentation of the text content corresponding to each piece of road source data in each data subset as road candidate words.
[0047] The first word in this embodiment refers to a certain number of words that have the characteristics of describing road names. Exemplarily, in the process of parsing the text content corresponding to each roa...
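Steps S210 and S220 can be illustrated with a minimal Python sketch. It is not the patented implementation: the keyword table EVENT_KEYWORDS, the suffix list ROAD_SUFFIXES, and the whitespace-based segmentation are all assumptions made for illustration, and a real system would likely use a trained classifier and a proper word segmenter.

```python
from collections import defaultdict

# Hypothetical keywords per road entity event type (assumption, not from the source).
EVENT_KEYWORDS = {
    "closure": ("closed", "closure"),
    "construction": ("construction", "roadworks"),
}

# Hypothetical suffixes that mark a token as road-name-like (assumption).
ROAD_SUFFIXES = ("Road", "Street", "Avenue", "Expressway")


def classify_by_event_type(texts):
    """Group source texts into subsets, one subset per event type (S210)."""
    subsets = defaultdict(list)
    for text in texts:
        lowered = text.lower()
        for event_type, keywords in EVENT_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                subsets[event_type].append(text)
                break  # assign each text to the first matching type only
    return dict(subsets)


def road_candidate_words(text):
    """Return road-name-like candidates (S220) using naive whitespace segmentation."""
    tokens = text.replace(",", " ").replace(".", " ").split()
    candidates = []
    for i, token in enumerate(tokens):
        # A suffix token preceded by another token is treated as "<name> <suffix>".
        if token in ROAD_SUFFIXES and i > 0:
            candidates.append(f"{tokens[i - 1]} {token}")
    return candidates
```

Calling classify_by_event_type on a list of article texts yields a dictionary keyed by event type, and road_candidate_words can then be applied to each text within a subset so that later comparison only happens between texts describing the same kind of event.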
Example Embodiment
[0061] Embodiment 3
[0062] Figure 3 is a flowchart of a road entity data deduplication method provided in the third embodiment of the present invention. This embodiment further optimizes and extends the foregoing embodiment. As shown in Figure 3, the method can include:
[0063] S310. Obtain at least one piece of road source data, and classify it into at least one data subset according to road entity event type, where each data subset corresponds to one road entity event type and the road source data describes road entity events.
[0064] S320: Determine the road name and geographic area name in the text content corresponding to each road source data in each data subset.
[0065] S330. For at least one road name and at least two geographical area names determined from the text content corresponding to each road source data in each data subset, according to the affiliation relationship between the road and the geogra...
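Step S320 can be sketched as a simple gazetteer lookup. This is only an illustration under stated assumptions: the lists KNOWN_ROADS and KNOWN_AREAS and both function names are hypothetical, substring matching stands in for whatever name recognition the patent actually uses, and the affiliation check of S330 is not modeled because its description is cut off above.

```python
# Hypothetical gazetteers (assumption): known road names and geographic area names.
KNOWN_ROADS = ("Zhongshan Road", "Renmin Avenue")
KNOWN_AREAS = ("Haidian District", "Beijing")


def find_names(text, gazetteer):
    """Return the gazetteer entries that occur in the text, in gazetteer order."""
    return [name for name in gazetteer if name in text]


def extract_road_and_area_names(text):
    """Sketch of S320: determine road names and geographic area names in one text."""
    return {
        "roads": find_names(text, KNOWN_ROADS),
        "areas": find_names(text, KNOWN_AREAS),
    }
```

The returned dictionary keeps road names and area names separate, so a later step can pair each road with the areas mentioned alongside it.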