Drug bid winning information mining method and device
A drug information mining technology, applied in the field of data processing, that addresses problems such as missing drug information, incomplete enterprise information, and difficult horizontal comparison.
Pending Publication Date: 2021-06-08
上海药慧信息技术有限公司
AI-Extracted Technical Summary
Problems solved by technology
[0004] In the related art, the main source of bidding data is the drug procurement platforms of the individual provinces, and there is no unified way to query them.
Moreover, the drug bidding data provided by the provincial drug procurement platforms suffers from problems such as inconsistent or missing expression of drug information, incomplete corporate information...
Abstract
The invention provides a drug bid winning information mining method and device, and relates to the technical field of data processing, and the method comprises the steps: obtaining the drug information of a bid inviting platform; calculating a target matching degree between the drug information and standard drug information in a pre-established listed drug standard table; and determining a target index value in the listed drug standard table according to the target matching degree, and storing the target index value into a standard bid-winning drug information table according to a drug information list corresponding to the target index value. Therefore, complete and standard bid-winning drug information can be obtained, so that drug bid-winning information details of each region can be known timely and accurately.
Application Domain
Natural language data processing; Other databases querying (+3 more)
Technology Topic
Drug standards; Data processing (+4 more)
Example Embodiment
[0061] The embodiments of the present application are described in detail below, and examples of the embodiments are illustrated in the drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present application; they should not be construed as limiting the present application.
[0062] The drug bid-winning information mining method and device of the embodiments of the present application are described below with reference to the drawings.
[0063] Figure 1 is a flow chart of a drug bid-winning information mining method according to an embodiment of the present application.
[0064] As shown in Figure 1, the drug bid-winning information mining method includes the following steps:
[0065] Step 101: Obtain the drug information of the bidding platform.
[0066] Step 102: Calculate the target matching degree between the drug information and the standard drug information in the pre-established listed drug standard table.
[0067] Step 103: Determine the target index value in the listed drug standard table according to the target matching degree, and store the result in the standard bid-winning drug information table according to the drug information list corresponding to the target index value.
[0068] In an embodiment of the present application, the obtained drug information includes, but is not limited to, the drug generic name, product name, dosage form, manufacturer, specification, and price, as shown in Figure 2A.
[0069] In an embodiment of the present application, the original generic name information in the drug information is cleaned, and ingredient name information is then obtained according to a generic name to ingredient name association map; the original enterprise information in the drug information is cleaned and then matched in an enterprise dictionary to obtain enterprise name information; edit distances are calculated between the ingredient name information and enterprise name information, respectively, and the ingredient name information and enterprise name information in the listed drug standard table, to obtain a first matching value and a second matching value; the original dosage form information in the drug information is cleaned and then string-matched against the dosage form information in the listed drug standard table to obtain a third matching value; the original specification information in the drug information is converted and then string-matched against the specification information in the listed drug standard table to obtain a fourth matching value; the original product name information in the drug information is cleaned, and an edit distance is calculated against the product name information in the listed drug standard table to obtain a fifth matching value; and the target matching degree is calculated from the first, second, third, fourth, and fifth matching values and the weight coefficient corresponding to each matching value.
[0070] Specifically, the acquired original generic name information is cleaned, and the corresponding ingredient name information is obtained according to the generic name to ingredient name association map; the acquired original manufacturer information is cleaned and matched in the enterprise dictionary. For the resulting ingredient name and enterprise name information, an edit distance value lev (numerical range: 0-1) is calculated against the ingredient name and enterprise name information in the listed drug standard table, and the matching degree $p = 1 - \mathrm{lev}$ gives the matching values of the ingredient name information and the enterprise name information. The subset of the listed drug standard table with $P_{\text{ingredient}} \geq 0.95$ and $P_{\text{enterprise}} \geq 0.95$ is then obtained, as shown in Figure 2B.
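The candidate filtering described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the text only states that lev lies in the range 0-1, so dividing the Levenshtein distance by the longer string length is an assumption, and the row keys `ingredient` and `enterprise` are hypothetical field names.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_degree(a: str, b: str) -> float:
    """p = 1 - lev, with lev normalized to 0-1 (normalization is assumed)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def candidate_subset(ingredient, enterprise, standard_rows, threshold=0.95):
    """Keep standard-table rows whose ingredient and enterprise both match."""
    return [
        row for row in standard_rows
        if match_degree(ingredient, row["ingredient"]) >= threshold
        and match_degree(enterprise, row["enterprise"]) >= threshold
    ]
```

Filtering on these two fields first keeps the later dosage form, specification, and product name comparisons confined to a small subset of the standard table.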
[0071] Specifically, the acquired original dosage form information is cleaned, and the cleaned dosage form information is string-matched against the dosage form information in the above subset of the listed drug standard table to obtain the matching value $P_{\text{dosage}}$ (numerical range: 0-1). For the acquired original specification information (as shown in Figure 2C) and the specification information in the above subset of the listed drug standard table (as shown in Figure 2D), regular expressions are used to extract the mass specification (mg), volume specification (ml), activity specification (IU), and percentage specification (per), and the values are converted to standard units (for example, 1 g is converted to a mass specification of 1000 mg).
[0072] Specifically, the values of each specification type are string-matched to obtain $P_{mg}$, $P_{ml}$, $P_{per}$, and $P_{iu}$ (each 0 or 1). Weights are set for the four specification types, and $P_{\text{spec}} = W_{mg} P_{mg} + W_{ml} P_{ml} + W_{per} P_{per} + W_{iu} P_{iu}$.
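The specification extraction, unit conversion, and weighted match can be sketched as below. Everything not in the text is an assumption made for illustration: the regular expression, the unit table, the weight values in `SPEC_WEIGHTS`, and the choice to score a specification type as 0 when it is absent from either string. Values are compared numerically after conversion rather than as raw strings.

```python
import math
import re

# Unit -> (specification type, factor to the standard unit). Assumed table.
UNIT_TABLE = {
    "mg": ("mg", 1.0),
    "g": ("mg", 1000.0),   # e.g. 1 g -> 1000 mg
    "ml": ("ml", 1.0),
    "l": ("ml", 1000.0),
    "iu": ("iu", 1.0),
    "%": ("per", 1.0),
}
# Longer unit names come first so that "mg" is not read as "g".
SPEC_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|ml|iu|g|l|%)(?![a-z])")

# Hypothetical weights; the source does not give numeric values.
SPEC_WEIGHTS = {"mg": 0.4, "ml": 0.3, "per": 0.15, "iu": 0.15}


def parse_spec(raw: str) -> dict:
    """Extract the first value of each specification type in standard units."""
    parsed = {}
    for number, unit in SPEC_PATTERN.findall(raw.lower()):
        kind, factor = UNIT_TABLE[unit]
        parsed.setdefault(kind, float(number) * factor)
    return parsed


def spec_match(raw_spec: str, standard_spec: str, weights=SPEC_WEIGHTS) -> float:
    """P_spec = sum of W_kind * P_kind, where each P_kind is 0 or 1."""
    raw, std = parse_spec(raw_spec), parse_spec(standard_spec)
    total = 0.0
    for kind, weight in weights.items():
        hit = kind in raw and kind in std and math.isclose(raw[kind], std[kind])
        total += weight * (1 if hit else 0)
    return total
```

For example, `spec_match("1g", "1000mg")` scores only the mass component, because neither string carries a volume, percentage, or activity value.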
[0073] Specifically, the acquired original product name information is cleaned, and an edit distance function is used to calculate the edit distance value lev (0-1) between the cleaned product name information and the product name information in the above subset of the listed drug standard table; $P_{\text{product}} = 1 - \mathrm{lev}$.
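[0074] Specifically, the product name, ingredient name, enterprise name, dosage form, and specification are each given a weight value ($W_{\text{product}}$, $W_{\text{ingredient}}$, $W_{\text{enterprise}}$, $W_{\text{dosage}}$, $W_{\text{spec}}$), and the total matching degree is $P_{\text{total}} = W_{\text{product}} P_{\text{product}} + W_{\text{ingredient}} P_{\text{ingredient}} + W_{\text{enterprise}} P_{\text{enterprise}} + W_{\text{dosage}} P_{\text{dosage}} + W_{\text{spec}} P_{\text{spec}}$. The maximum value of $P_{\text{total}}$ is obtained, and the index value in the listed drug standard table corresponding to that maximum is returned.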
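A minimal sketch of the weighted total and the selection of the best index value follows. The weight values in `TOTAL_WEIGHTS` and the dictionary layout are assumptions made for illustration; the source names the five weights but does not give numbers.

```python
# Hypothetical weights for the five per-field matching values.
TOTAL_WEIGHTS = {
    "product": 0.30,
    "ingredient": 0.25,
    "enterprise": 0.20,
    "dosage": 0.15,
    "spec": 0.10,
}


def total_match(scores: dict, weights=TOTAL_WEIGHTS) -> float:
    """P_total = sum over fields of W_field * P_field."""
    return sum(weights[field] * scores.get(field, 0.0) for field in weights)


def best_index(candidates: dict, weights=TOTAL_WEIGHTS):
    """Return (index value, P_total) of the best candidate, or None if empty.

    `candidates` maps each index value in the standard table to the
    per-field matching values computed for the incoming record.
    """
    if not candidates:
        return None
    best = max(candidates, key=lambda idx: total_match(candidates[idx], weights))
    return best, total_match(candidates[best], weights)
```

Returning the score together with the index lets a caller apply its own acceptance cut-off before writing to the standard bid-winning drug information table; such a cut-off is not described in the text.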
[0075] Further, the detailed drug information represented by the index value is obtained, and standard bid-winning drug information is generated and stored in the standard bid-winning drug information table, as shown in Figure 2E.
[0076] In the drug bid-winning information mining method of the embodiments of the present application, the drug information of the bidding platform is obtained; the target matching degree between the drug information and the standard drug information in the pre-established listed drug standard table is calculated; the target index value in the listed drug standard table is determined according to the target matching degree; and the result is stored in the standard bid-winning drug information table according to the drug information list corresponding to the target index value. Complete and standardized bid-winning drug information can thereby be obtained, so that the details of drug bid-winning information in each region can be known in a timely and accurate manner.
[0077] Based on the above embodiment, the present application pre-establishes the listed drug standard table, specifically as shown in Figure 3, including:
[0078] Step 201: Obtain the original drug information of listed drugs.
[0079] Step 202: Analyze the original drug information to generate the listed drug standard table.
[0080] In an embodiment of the present application, original information such as the drug name, product name, manufacturer, dosage form, and specification is obtained from the National Medical Products Administration.
[0081] In an embodiment of the present application, analyzing the original drug information to generate the listed drug standard table includes: cleaning the drug name and matching it in a drug dictionary to obtain standard drug name information; cleaning the drug manufacturer information and matching it in an enterprise dictionary to obtain standard enterprise name information; cleaning the drug dosage form and converting it into a standard dosage form to obtain standard dosage form information; obtaining, according to a drug name to generic name to ingredient name association map, the standard generic name information and standard ingredient name information corresponding to the drug name; and generating the listed drug standard table from the standard drug name information, standard enterprise name information, standard dosage form information, standard generic name information, standard ingredient name information, specification information, product name information, and index value.
[0082] Specifically, the above drug name information is cleaned (for example, removing special characters, character conversion, and conversion to lowercase), and the cleaned data is matched in the drug dictionary to obtain standard drug name information. The above manufacturer information is cleaned (for example, removing special characters, character conversion, and conversion to lowercase), and the cleaned data is matched in the enterprise dictionary to obtain standard enterprise name information and an enterprise label (domestic or foreign-funded). The dosage form is cleaned, for example: (1) English names are converted into Chinese names ('Tab': 'Tablet', 'CAP': 'Capsule', 'DRP': 'Drop', 'Pill': 'Pills', etc.); (2) differently expressed dosage forms are converted into standard dosage forms ('ordinary oral tablet': 'tablet', 'liquid preparation': 'injection', 'tablet (lozenge) (sugar-coated)': 'lozenge', etc.), to obtain standard dosage form information. According to the drug name to generic name to ingredient name association map, the generic name information and ingredient name information corresponding to the drug name are obtained. The above information is recombined according to the correspondence in the originally captured information to generate the listed drug standard table, and a unique index value is generated for each record, as shown in Figure 4.
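The cleaning and dosage form conversion steps can be sketched as follows. The interpretation of "character conversion" as Unicode NFKC normalization (full-width to half-width), the regular expression, and the contents of `DOSAGE_FORM_MAP` are assumptions; a real mapping would be maintained as a curated dictionary.

```python
import re
import unicodedata

# Hypothetical excerpt of a dosage form dictionary, keyed by cleaned text.
DOSAGE_FORM_MAP = {
    "tab": "tablet",
    "cap": "capsule",
    "drp": "drop",
    "pill": "pills",
    "ordinaryoraltablet": "tablet",
}


def clean_text(raw: str) -> str:
    """Normalize character width, lowercase, and drop special characters."""
    text = unicodedata.normalize("NFKC", raw).lower()
    # \w is Unicode-aware for str patterns, so CJK characters are kept.
    return re.sub(r"[^\w]+", "", text)


def normalize_dosage_form(raw: str) -> str:
    """Map a cleaned dosage form to its standard form, if one is known."""
    cleaned = clean_text(raw)
    return DOSAGE_FORM_MAP.get(cleaned, cleaned)
```

Unknown dosage forms fall through unchanged here, which keeps unmapped values visible for later review rather than silently discarding them.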
[0083] Based on the above embodiment, the present application further includes: obtaining, from the standard bid-winning drug information table, two conversion ratio information sets, each consisting of records whose name information is the same as that of the newly added bid-winning drug data; calculating a first occurrence ratio of the conversion ratio of the newly added bid-winning drug in the first conversion ratio information set; calculating a second occurrence ratio of the conversion ratio of the newly added bid-winning drug in the second conversion ratio information set; and determining, according to the first occurrence ratio, the second occurrence ratio, and a preset threshold (set according to the application), whether the conversion ratio is abnormal information.
[0084] Specifically, two conversion ratio information sets are obtained from the standard bid-winning drug information table, each consisting of records whose name information is the same as that of the newly added bid-winning drug data; the proportion in which the conversion ratio of the newly added bid-winning drug occurs in the first conversion ratio information set is calculated as ratio 1; the proportion in which it occurs in the second conversion ratio information set is calculated as ratio 2; if ratio 1 < 5% and ratio 2 < 5%, the conversion ratio information of this record is marked as abnormal.
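The abnormality check can be sketched as below. Because the text does not make clear which name fields define each of the two reference sets, the function simply takes both sets as arguments; treating an empty set as an occurrence ratio of 0 is also an assumption.

```python
def occurrence_ratio(value, values) -> float:
    """Share of entries in `values` that equal `value` (0.0 for an empty set)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v == value) / len(values)


def is_abnormal_ratio(new_ratio, first_set, second_set, threshold=0.05) -> bool:
    """Flag the conversion ratio when it is rare in both reference sets."""
    return (occurrence_ratio(new_ratio, first_set) < threshold
            and occurrence_ratio(new_ratio, second_set) < threshold)
```

For instance, a conversion ratio that appears once among 41 historical records (about 2.4%) and never in the second set would be flagged, while the dominant ratio in either set would not.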
[0085] Based on the above embodiment, the present application further includes: obtaining data whose conversion ratio is marked as abnormal, and obtaining a first price information set of data having the same ingredient name, dosage form, specification, and enterprise type as the abnormal data; if the first price information set is empty, obtaining a second price information set of data having the same ingredient name as the abnormal data; obtaining, from the second price information set, target price information whose price difference from the abnormal data is less than a price threshold; setting the conversion ratio information corresponding to the target price information as the conversion ratio information of the abnormal data; and clearing the abnormal flag.
[0086] Specifically, data whose conversion ratio is marked as abnormal is obtained, together with the price information set of data having the same ingredient name, dosage form, specification, and enterprise type (domestic or foreign-funded) as this data; if that price information set is empty, the price information set of data having the same ingredient name as this data is obtained instead. The price information closest to the price of this abnormal data is found in the above price information set, the conversion ratio information corresponding to that price information is taken as the conversion ratio information of this abnormal data, and the abnormal flag is cleared.
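A minimal sketch of this repair step, under stated assumptions: records are plain dictionaries with hypothetical keys (`ingredient`, `dosage_form`, `spec`, `enterprise_type`, `price`, `conversion_ratio`, `abnormal`), the fallback set is keyed on the ingredient name alone, and the nearest price is taken without an explicit price threshold.

```python
STRICT_KEYS = ("ingredient", "dosage_form", "spec", "enterprise_type")


def repair_conversion_ratio(abnormal: dict, records: list) -> dict:
    """Borrow the conversion ratio of the reference record nearest in price."""
    # First set: same ingredient, dosage form, specification, enterprise type.
    pool = [r for r in records
            if all(r[key] == abnormal[key] for key in STRICT_KEYS)]
    if not pool:
        # Fallback set: same ingredient name only (assumed fallback key).
        pool = [r for r in records if r["ingredient"] == abnormal["ingredient"]]
    if not pool:
        return abnormal  # nothing comparable; keep the abnormal flag
    nearest = min(pool, key=lambda r: abs(r["price"] - abnormal["price"]))
    return {**abnormal,
            "conversion_ratio": nearest["conversion_ratio"],
            "abnormal": False}
```

The function returns a new dictionary rather than mutating its input, so the original abnormal record stays available for auditing.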
[0087] Based on the above embodiment, the present application further includes: obtaining the original price information set and the conversion ratio information set of data having the same product name information as the newly added bid-winning data; obtaining a unit price information set based on unit price = price / conversion ratio; taking the median of the unit price information set as the standard price; comparing the price and the unit price of the newly added bid-winning data with the standard price, and taking the one whose difference from the standard price is less than a preset price difference (set according to the application) as the target unit price; and, if the difference between the target unit price and the standard price is greater than a preset threshold, recalculating the target unit price and the standard price.
[0088] Specifically, the original price information set and the conversion ratio information set of data having the same product name information as the newly added bid-winning data are obtained; a unit price information set is obtained based on unit price = price / conversion ratio; the median of the values in the unit price information set is taken as the standard price; the price and the unit price of the newly added bid-winning data are compared with the standard price, and the one closest to the standard price is taken as the final unit price; if |final unit price - standard price| / standard price > 0.75, the above steps are repeated to recalculate the target unit price and the standard price.
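The unit price standardization can be sketched as follows. The function names and the returned "needs recalculation" flag are illustrative assumptions; the text does not specify how the recalculation itself is carried out, so the sketch only reports when the 0.75 relative deviation is exceeded.

```python
from statistics import median


def standard_price(prices, conversion_ratios) -> float:
    """Median of unit price = price / conversion ratio over reference records."""
    return median(p / r for p, r in zip(prices, conversion_ratios))


def final_unit_price(price, conversion_ratio, std_price, max_deviation=0.75):
    """Pick whichever of price or price / ratio lies closer to the standard.

    Returns (final unit price, needs_recalculation).
    """
    candidates = (price, price / conversion_ratio)
    final = min(candidates, key=lambda c: abs(c - std_price))
    needs_recalculation = abs(final - std_price) / std_price > max_deviation
    return final, needs_recalculation
```

Comparing both the raw price and the divided price covers the case where a platform already reports the price per minimum packaging unit, so dividing again would understate it.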
[0089] Thus, by establishing a set of record linkage algorithms, the bid-winning drug information acquired from the provincial drug procurement platforms is linked with the national listed drug information to obtain complete and standardized bid-winning drug information; in addition, the price information of bid-winning drugs is uniformly processed into the price per minimum packaging unit (tablet / capsule / vial). This reduces manual work and improves data processing speed and accuracy.
[0090] In order to implement the above embodiments, the present application also proposes a drug bid-winning information mining device.
[0091] Figure 5 is a structural diagram of a drug bid-winning information mining device provided by an embodiment of the present application.
[0092] As shown in Figure 5, the drug bid-winning information mining device includes: an acquisition module 510, a calculation module 520, and a processing module 530.
[0093] The acquisition module 510 is configured to obtain the drug information of the bidding platform.
[0094] The calculation module 520 is configured to calculate the target matching degree between the drug information and the standard drug information in the pre-established listed drug standard table.
[0095] The processing module 530 is configured to determine the target index value in the listed drug standard table according to the target matching degree, and to store the result in the standard bid-winning drug information table according to the drug information list corresponding to the target index value.
[0096] In the drug bid-winning information mining device of the embodiments of the present application, the drug information of the bidding platform is obtained; the target matching degree between the drug information and the standard drug information in the pre-established listed drug standard table is calculated; the target index value in the listed drug standard table is determined according to the target matching degree; and the result is stored in the standard bid-winning drug information table according to the drug information list corresponding to the target index value. Complete and standardized bid-winning drug information can thereby be obtained, so that the details of drug bid-winning information in each region can be known in a timely and accurate manner.
[0097] It should be noted that the foregoing explanation of the drug bid-winning information mining method embodiment also applies to the drug bid-winning information mining device of this embodiment, and details are not repeated here.
[0098] In order to implement the above embodiments, the present application also provides a computer device, comprising: a processor, and a memory for storing instructions executable by the processor.
[0099] The processor runs a program corresponding to executable program code by reading the executable program code stored in the memory, so as to implement the drug bid-winning information mining method proposed by the present application.
[0100] In order to implement the above embodiments, the present application also proposes a non-transitory computer readable storage medium; when the instructions in the storage medium are executed by a processor, the processor is able to perform the drug bid-winning information mining method of the present application.
[0101] In order to implement the above embodiments, the present application also proposes a computer program product; when the instructions in the computer program product are executed by a processor, the drug bid-winning information mining method of the present application is implemented.
[0102] Figure 6 shows a block diagram of an exemplary computer device suitable for implementing the embodiments of the present application. The computer device 12 shown in Figure 6 is merely an example and places no limitation on the function or scope of use of the embodiments of the present application.
[0103] As shown in Figure 6, the computer device 12 takes the form of a general-purpose computing device. The components of computer device 12 may include, but are not limited to, one or more processors or processing units 16, a system memory 28, and a bus 18 connecting different system components (including the system memory 28 and the processing unit 16).
[0104] Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus structures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnection (PCI) bus.
[0105] Computer device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by the computer device 12, including volatile and non-volatile media and removable and non-removable media.
[0106] Memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Figure 6, commonly referred to as a "hard drive"). Although not shown in Figure 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (e.g., a Compact Disc Read Only Memory (CD-ROM), a Digital Video Disc Read Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of the embodiments of the present application.
[0107] A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination thereof, may include an implementation of a network environment. Program modules 42 typically perform the functions and/or methods in the embodiments described herein.
[0108] Computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. In addition, computer device 12 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the computer device 12 via the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in combination with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
[0109] The processing unit 16 performs various functional applications and data processing by running the programs stored in the system memory 28, for example implementing the drug bid-winning information mining method described in the foregoing embodiments.
[0110] In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features thereof.
[0111] Moreover, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "multiple" means at least two, such as two or three, unless otherwise specifically defined.
[0112] Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process, and the scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
[0113] The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be considered an ordered list of executable instructions for implementing logic functions, and may be embodied in any computer readable medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer readable medium" may be any apparatus that can contain, store, communicate, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer readable media include the following: an electrical connection (electronic device), a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
[0114] It should be understood that parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field programmable gate array (FPGA), and the like.
[0115] One of ordinary skill in the art will appreciate that all or some of the steps of the above embodiment methods may be completed by a program instructing the related hardware; the program may be stored in a computer readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
[0116] Further, the functional units in the various embodiments of the present application may be integrated in one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware or in the form of a software function module. If implemented in the form of a software function module and sold or used as a separate product, the integrated module may also be stored in a computer readable storage medium.
[0117] The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting the present application; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present application.