Method and device for mining attribute name paraphrases
A technique involving attribute names and phrase pairs, applied in natural language data processing, special data processing applications, network data retrieval, etc., that can address problems such as marked differences between attribute names.
Examples
Embodiment 1
[0056] Figure 1 is a flowchart of the method for mining attribute name paraphrases provided by the first embodiment of the present invention. As shown in Figure 1, the method may include the following steps:
[0057] Step 101: Obtain at least one resource among Q-Q, Q-T, and T-T from a query log as a candidate sentence pair.
[0058] The purpose of this step is to obtain, from the query log, the sentence pair resources used for subsequent mining. The query log records the user's query sessions and the titles of the pages clicked. The query log used may cover a specified period of time, such as the query log of a single day.
[0059] The above Q-Q refers to a query-query pair, that is, two queries searched by a user within one session; these two queries may have the same meaning.
[0060] The above Q-T refers to a query-clicked title pair, that is, a query and the corresponding clicked title. Usually, the semantics between the q...
Embodiment 2
[0091] Figure 2 is a structural diagram of the apparatus for mining attribute name paraphrases provided in the second embodiment of the present invention. As shown in Figure 2, the apparatus includes: a candidate sentence pair acquisition unit 201, a first phrase pair extraction unit 202, a second phrase pair extraction unit 203, and a noise filtering unit 204.
[0092] The candidate sentence pair acquisition unit 201 obtains at least one resource among Q-Q, Q-T, and T-T from the query log as candidate sentence pairs, where Q-Q is the sentence pair formed by two queries searched by the user in one session, Q-T is the sentence pair formed by a query and the corresponding clicked title, and T-T is the sentence pair formed by two clicked titles corresponding to the same query.
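The three kinds of candidate sentence pairs defined above can be illustrated with a minimal Python sketch. The record layout (`LogRecord` with `session_id`, `query`, and `clicked_titles`) and the function name `candidate_sentence_pairs` are assumptions made for illustration only; the source does not specify a query log format or an implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class LogRecord:
    """Hypothetical query-log record; the field names are assumptions."""
    session_id: str
    query: str
    clicked_titles: list = field(default_factory=list)


def candidate_sentence_pairs(records):
    """Return the Q-Q, Q-T, and T-T candidate sentence pairs as three sets."""
    queries_by_session = defaultdict(list)  # session id -> distinct queries
    titles_by_query = defaultdict(list)     # query -> distinct clicked titles
    for rec in records:
        if rec.query not in queries_by_session[rec.session_id]:
            queries_by_session[rec.session_id].append(rec.query)
        for title in rec.clicked_titles:
            if title not in titles_by_query[rec.query]:
                titles_by_query[rec.query].append(title)

    # Q-Q: two queries searched within the same session.
    qq = {p for qs in queries_by_session.values() for p in combinations(qs, 2)}
    # Q-T: a query and a title clicked for that query.
    qt = {(q, t) for q, ts in titles_by_query.items() for t in ts}
    # T-T: two clicked titles corresponding to the same query.
    tt = {p for ts in titles_by_query.values() for p in combinations(ts, 2)}
    return qq, qt, tt


# Made-up example data, not taken from any real log.
records = [
    LogRecord("s1", "iphone price", ["iPhone cost list", "iPhone prices 2012"]),
    LogRecord("s1", "iphone cost"),
]
qq, qt, tt = candidate_sentence_pairs(records)
```

Because the text says "at least one resource among Q-Q, Q-T and T-T", a caller could use any subset of the three returned sets as the candidate pool.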
[0093] The first phrase pair extracting unit 202 extracts phrase pairs with the same context from each candidate sentence pair as candidate paraphrase phrase pairs. Speci...
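The excerpt is cut off before it explains how "the same context" is determined, so the following Python sketch shows only one simple interpretation: the two sentences are split on whitespace, their shared leading and trailing words are treated as the context, and the differing middle spans are returned as a candidate phrase pair. The function name `same_context_phrase_pair` and this prefix/suffix rule are assumptions, not the extraction procedure of unit 202.

```python
def same_context_phrase_pair(sentence_a, sentence_b):
    """Return the differing middle spans of two sentences that share context.

    Assumption: "context" means identical words before and/or after the
    differing span. Returns None when there is no shared context or when
    one of the differing spans is empty.
    """
    a, b = sentence_a.split(), sentence_b.split()
    shortest = min(len(a), len(b))

    # Length of the longest common prefix.
    prefix = 0
    while prefix < shortest and a[prefix] == b[prefix]:
        prefix += 1

    # Length of the longest common suffix that does not overlap the prefix.
    suffix = 0
    while suffix < shortest - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1

    if prefix + suffix == 0:
        return None  # no shared context at all

    mid_a = a[prefix:len(a) - suffix]
    mid_b = b[prefix:len(b) - suffix]
    if not mid_a or not mid_b:
        return None  # identical sentences, or a pure insertion
    return " ".join(mid_a), " ".join(mid_b)


# Made-up example sentences.
pair = same_context_phrase_pair("nokia n97 price list", "nokia n97 quotation list")
```

For the example above, the shared context is "nokia n97 ... list" and the candidate phrase pair is the two differing words.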