Building data cleaning and merging method and device and storage medium

A real estate data cleaning technology, applied in the field of data cleaning and merging, addresses problems such as inefficient merging rules and redundant data that hampers the subsequent application of real estate data, so as to achieve accurate deduplication, ensure data validity, and improve cleaning efficiency.

Pending Publication Date: 2022-04-26
广州探迹科技有限公司

AI Technical Summary

Problems solved by technology

[0003] (1) The same real estate has different names and addresses in each source, and the validity of the data is not ensured through data cleaning and merging;
[0004] (2) Ordinary data cleaning and merging rules are inefficient, and because no recognition algorithm is introduced to identify and deduplicate records of the same real estate, a large amount of redundant data remains, which affects the subsequent application of real estate data.


Examples

Embodiment Construction

[0057] The technical solution in the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the present invention. Obviously, the described embodiments are only some embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0058] As shown in Figure 1, a real estate data cleaning and merging method provided by an embodiment of the present invention includes the following steps:

[0059] Step S101: Obtain the original real estate data, which includes real estate name data and real estate address data for each real estate, and establish a mapping relationship between the name data and the address data of the same real estate.
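A minimal sketch of what the mapping in step S101 might look like, assuming each crawled row is a dictionary with `name` and `address` fields. The function name `build_name_address_mapping`, the field names, and the use of a running record id are hypothetical illustrations, not details taken from the patent text.

```python
from typing import Dict, Iterable, Mapping, Tuple


def build_name_address_mapping(
    raw_rows: Iterable[Mapping[str, str]],
) -> Dict[int, Tuple[str, str]]:
    """Link the name and address of each record under one record id.

    Assumption: keeping both fields under a shared id lets later steps
    clean names and addresses separately without losing the association.
    """
    mapping: Dict[int, Tuple[str, str]] = {}
    for record_id, row in enumerate(raw_rows):
        # Strip surrounding whitespace only; real cleaning rules are out of scope.
        mapping[record_id] = (row["name"].strip(), row["address"].strip())
    return mapping


# Hypothetical sample rows standing in for crawled data.
sample_rows = [
    {"name": " Sunrise Tower ", "address": "1 Main St"},
    {"name": "Harbor Plaza", "address": " 8 Dock Rd "},
]
name_address_mapping = build_name_address_mapping(sample_rows)
```

Keying by a record id rather than by name is a design choice made here because, as the text notes, the same real estate can appear under different names across sources.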

[0060] Step S102: Filter the real estate data ac...


Abstract

The invention discloses a building data cleaning and merging method, device, and storage medium. The method comprises the following steps: acquiring multiple pieces of POI information corresponding to each building name according to the building name in the second building data; acquiring multiple pieces of POI information corresponding to each building address according to the building address in the second building data; combining the POI information corresponding to the building name with the POI information corresponding to the building address of the same building to obtain a POI information set for each building; selecting one piece of POI information from the POI information set of each building as the first POI information of that building; and judging whether any two buildings are the same building according to their first POI information, then deduplicating the records in the second building data that are judged to be the same building to obtain third building data. This technical scheme merges building data efficiently and greatly reduces redundant data.
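The flow in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say how the single POI is selected or how two buildings are compared, so this sketch assumes the most frequent POI in the combined set is chosen and that two buildings are the same when their chosen POIs are equal. The lookup function, the POI identifiers, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Building = Dict[str, str]               # assumed keys: "name", "address"
PoiLookup = Callable[[str], List[str]]  # hypothetical: query text -> POI ids


def pick_first_poi(building: Building, lookup: PoiLookup) -> Optional[str]:
    """Combine POIs found by name and by address, then pick one.

    Assumption: the most frequent POI wins; ties go to the earliest hit.
    """
    candidates = lookup(building["name"]) + lookup(building["address"])
    if not candidates:
        return None
    counts = Counter(candidates)
    # max() returns the first maximal element, so ties keep lookup order.
    return max(candidates, key=lambda poi: counts[poi])


def deduplicate(buildings: List[Building], lookup: PoiLookup) -> List[Building]:
    """Keep the first record seen for each selected POI."""
    seen = set()
    kept: List[Building] = []
    for building in buildings:
        poi = pick_first_poi(building, lookup)
        if poi is None:
            kept.append(building)  # no POI found: cannot judge, so keep it
            continue
        if poi in seen:
            continue               # judged to be the same building: drop
        seen.add(poi)
        kept.append(building)
    return kept


# Hypothetical in-memory index standing in for a real POI search service.
FAKE_POI_INDEX = {
    "Sunrise Tower": ["poi-1", "poi-9"],
    "1 Main St": ["poi-1"],
    "Sunrise Tower (Block A)": ["poi-1"],
    "No. 1 Main Street": ["poi-1", "poi-7"],
    "Harbor Plaza": ["poi-2"],
    "8 Dock Rd": ["poi-2"],
}


def fake_lookup(query: str) -> List[str]:
    return FAKE_POI_INDEX.get(query, [])


second_building_data = [
    {"name": "Sunrise Tower", "address": "1 Main St"},
    {"name": "Sunrise Tower (Block A)", "address": "No. 1 Main Street"},
    {"name": "Harbor Plaza", "address": "8 Dock Rd"},
]
third_building_data = deduplicate(second_building_data, fake_lookup)
```

In this sample, the first two records resolve to the same POI even though their names and addresses differ, so only one of them survives deduplication. A production system would likely need a fuzzier comparison than strict POI equality.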

Description

Technical field

[0001] The present invention relates to the technical field of data cleaning and merging, in particular to a method, device, and storage medium for real estate data cleaning and merging.

Background

[0002] After the crawler completes data crawling, real estate names and real estate addresses are clustered to merge the data from multiple sources; at the same time, the industrial and commercial registration addresses of enterprises are matched against real estate addresses, so that each enterprise is associated with the corresponding real estate. This supports users on the product side in obtaining real estate data nationwide and in discovering the enterprises located in each real estate. In the prior art, cleaning and merging real estate data has the following disadvantages:

[0003] (1) The same real estate has different names and real estate addresses in each source, and the validity of the data has not bee...

Application Information

IPC(8): G06F16/215, G06F16/29, G06Q50/16
CPC: G06F16/215, G06F16/29, G06Q50/16
Inventor: 陈开冉, 黎展, 黄俊强, 秦倜骅
Owner 广州探迹科技有限公司