Apparatus and method for automatically verifying quality data associated with at least one slide
The apparatus and method automate the verification of slide quality data by identifying and aligning stained cores to standard formats, addressing manual inconsistencies and enhancing reproducibility in diagnostic workflows.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- PRAMANA INC
- Filing Date
- 2025-12-03
- Publication Date
- 2026-06-22
AI Technical Summary
Traditional methods for verifying quality data associated with slides rely on manual scrutiny and subjective interpretation, leading to inconsistencies and inefficiencies in diagnostic workflows, and lack standardized processes for identifying stained cores and aligning them into a reproducible format.
The apparatus and method use a processor and memory to receive digital slides, determine slide identification from metadata, locate stained cores using object detection techniques, align them to a standard format, and evaluate them against control cores using predetermined thresholds, generating verification outputs for downstream devices.
This approach enhances the accuracy and reliability of slide evaluation by standardizing the process, reducing inconsistencies, and ensuring reproducible quality data verification.
Abstract
Description
[Technical Field]
[0001] This invention relates generally to the field of pathology. In particular, the invention relates to an apparatus and method for automatically verifying quality data associated with at least one slide.
[Background Art]
[0002] Traditional methods for verifying quality data associated with slides often rely on manual scrutiny and subjective interpretation, leading to inconsistencies and inefficiencies in diagnostic workflows. Furthermore, the lack of standardized processes for identifying stained cores and aligning them into a reproducible format presents challenges in achieving accurate and reliable evaluation of digital slides.
[Summary of the Invention]
[Means for Solving the Problem]
[0003] In one embodiment, an apparatus for automatically verifying quality data associated with at least one slide includes at least one processor and a memory communicably connected to the at least one processor. The memory includes instructions that configure the processor to perform: receiving at least one digital slide corresponding to at least one slide from at least one optical device; determining slide identification using metadata; locating at least one digital slide using at least one object detection technique, wherein locating at least one digital slide includes identifying a stained core of a target and aligning the stained core of a target to a standard format; and evaluating a stained core of a target using a predetermined threshold, wherein evaluating includes comparing the stained core of a target to a control core, generating a verification output as a function of the comparison between the stained core of a target and the control core, and transmitting the verification output to a downstream device.
[0004] In another embodiment, a method for automatically verifying quality data associated with at least one slide includes receiving at least one digital slide corresponding to at least one slide from at least one optical device; determining slide identification using metadata; locating at least one digital slide using at least one object detection technique, wherein locating at least one digital slide includes identifying a stained core of a target and aligning the stained core of a target to a standard format; and evaluating a stained core of a target using a predetermined threshold, wherein the evaluation includes comparing the stained core of a target to a control core, generating a verification output as a function of the comparison between the stained core of a target and the control core, and transmitting the verification output to a downstream device.
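By way of a non-limiting illustration, the evaluating step described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: it assumes each core has already been reduced to a single stain-intensity value, and that a core passes when its absolute difference from the control core is within the predetermined threshold. The names `VerificationOutput` and `evaluate_core` are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationOutput:
    """Hypothetical container for the result sent to a downstream device."""
    slide_id: str
    target_value: float
    control_value: float
    difference: float
    passed: bool


def evaluate_core(slide_id: str, target_value: float, control_value: float,
                  threshold: float) -> VerificationOutput:
    """Compare a target core to a control core using a predetermined threshold.

    Assumption: each core is summarized by one stain-intensity number, and the
    core passes when the absolute difference is within the threshold.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    difference = abs(target_value - control_value)
    return VerificationOutput(slide_id, target_value, control_value,
                              difference, difference <= threshold)


# Example: a target core close to its control passes; a distant one fails.
close = evaluate_core("SLD-20241130-001", 0.62, 0.60, threshold=0.05)
far = evaluate_core("SLD-20241130-001", 0.90, 0.60, threshold=0.05)
print(close.passed, far.passed)
```

The verification output is modeled as an immutable record so that it can be serialized and transmitted without being altered after the comparison.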
[0005] These and other aspects and features of non-limiting embodiments of the present invention will become apparent to those skilled in the art when the following description of a particular non-limiting embodiment of the present invention is considered in conjunction with the accompanying drawings.
[0006] For the purpose of illustrating the present invention, the drawings illustrate aspects of one or more embodiments of the present invention. However, it should be understood that the present invention is not limited to the exact arrangement and means shown in the drawings.
[Brief Description of the Drawings]
[0007]
[Figure 1] Figure 1 is a block diagram of a device for automatically verifying quality data associated with at least one slide.
[Figure 2A] Figure 2A is an example of a control tissue microarray for each slide.
[Figure 2B] Figure 2B is an illustrative diagram of a batch control tissue microarray.
[Figure 3A] Figure 3A is an illustrative diagram of a control tissue microarray with one labeled core.
[Figure 3B] Figure 3B is an illustrative diagram of a control tissue microarray having two cores in a 1×2 matrix array.
[Figure 3C] Figure 3C is an illustrative diagram of a control tissue microarray having five cores, consisting of a 4×1 matrix and a 1-matrix array.
[Figure 3D] Figure 3D is an illustrative diagram of a control tissue microarray having nine cores in a 3×3 matrix arrangement.
[Figure 3E] Figure 3E is an illustrative diagram of a control tissue microarray having 10 cores in a 2×5 matrix arrangement.
[Figure 3F] Figure 3F is an illustrative diagram of a control tissue microarray having five cores in a 1×5 matrix array.
[Figure 4] (A) is an illustrative diagram of a 2×4 tissue microarray having a target core at position 3 for a given stain. (B) is an illustrative diagram of a mirrored configuration along the vertical axis of a 2×4 tissue microarray, with the target core appearing in row 1, column 2. (C) is an illustrative diagram of a mirrored configuration along the horizontal axis of a 2×4 tissue microarray, with the target core appearing in row 2, column 3. (D) is an illustrative diagram of how mirrored configurations are detected using the texture and color of a specific core.
[Figure 5A] Figure 5A is an illustrative diagram showing the workflow for calculating the DAB value for each stain.
[Figure 5B] Figure 5B is an illustrative diagram showing a statistical model for multiple stains.
[Figure 6] Figure 6 is an example of a workflow for quality control of new slides.
[Figure 7] Figure 7 is a block diagram of an exemplary method for automatically verifying quality data associated with at least one slide.
[Figure 8] Figure 8 is a block diagram of an exemplary machine learning process.
[Figure 9] Figure 9 is a diagram illustrating an exemplary embodiment of a neural network.
[Figure 10] Figure 10 is a diagram illustrating an exemplary embodiment of a neural network node.
[Figure 11] Figure 11 is a block diagram of a computing system that may be used to implement any one or more of the methodologies disclosed herein and any one or more portions thereof.
[Modes for Carrying Out the Invention]
[0008] Drawings are not necessarily to scale and may be represented by imaginary lines, graphic symbols, and partial drawings. In certain cases, details not necessary for understanding the embodiment, or details that would make it difficult to perceive other details, may be omitted.
[0009] In summary, aspects of this disclosure relate to an apparatus and method for automatically verifying quality data associated with at least one slide. The apparatus includes at least one computing device, which includes a processor and memory communicably connected to the processor. The memory instructs the processor to receive at least one digital slide corresponding to at least one slide from at least one optical device. The processor uses metadata to determine slide identification. The processor uses at least one object detection technique to locate at least one digital slide, and locating at least one digital slide includes identifying the stained core of the target and aligning the stained core of the target to a standard format. Furthermore, the processor evaluates the stained core of the target using a predetermined threshold, and evaluating includes comparing the stained core of the target to a control core, generating a verification output as a function of the comparison between the stained core of the target and the control core, and transmitting the verification output to a downstream device.
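As a non-limiting illustration of aligning a stained core to a standard format, the mirrored 2×4 configurations in the description of Figure 4 can be reproduced with simple index arithmetic. The sketch below assumes that core positions are numbered from 1 in row-major order, which the disclosure does not state explicitly. The function names are hypothetical.

```python
def position_to_row_col(position: int, n_cols: int) -> tuple[int, int]:
    """Convert a 1-based, row-major core position to a 1-based (row, column).

    Assumption: positions are counted left to right, then top to bottom.
    """
    if position < 1 or n_cols < 1:
        raise ValueError("position and n_cols must be positive")
    row, col = divmod(position - 1, n_cols)
    return row + 1, col + 1


def mirror(row: int, col: int, n_rows: int, n_cols: int, axis: str) -> tuple[int, int]:
    """Return where a core lands when the array is mirrored along an axis.

    "vertical" flips left-right; "horizontal" flips top-bottom. A mirror is
    its own inverse, so calling this again restores the standard layout.
    """
    if axis == "vertical":
        return row, n_cols + 1 - col
    if axis == "horizontal":
        return n_rows + 1 - row, col
    raise ValueError("axis must be 'vertical' or 'horizontal'")


# Target core at position 3 of a 2x4 array, as in the Figure 4 description.
standard = position_to_row_col(3, n_cols=4)
flipped_v = mirror(*standard, n_rows=2, n_cols=4, axis="vertical")
flipped_h = mirror(*standard, n_rows=2, n_cols=4, axis="horizontal")
print(standard, flipped_v, flipped_h)
```

Under the stated assumption, position 3 corresponds to row 1, column 3. The vertical-axis mirror moves it to row 1, column 2 and the horizontal-axis mirror moves it to row 2, column 3, which matches the positions given for Figure 4(B) and Figure 4(C).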
[0010] Referring now to Figure 1, an exemplary embodiment of a device 100 for automatically verifying quality data associated with at least one slide 110 is shown. The device 100 may include a processor 102 communicatively connected to a memory 104. As used in this disclosure, “communicatively connected” means connected by way of a connection, attachment, or linkage between two or more related elements that enables the reception and/or transmission of information between them. For example, and without limitation, such a connection may be wired or wireless, direct or indirect, and may link two or more components, circuits, devices, systems, and the like, enabling the reception and/or transmission of data and/or signals between them. The data and/or signals may include, without limitation, electrical, electromagnetic, magnetic, video, audio, radio, and microwave data and/or signals, combinations thereof, and the like. A communicative connection may be achieved, for example and without limitation, through wired or wireless electronic, digital, or analog communication, either directly or by way of one or more intervening devices or components. Furthermore, a communicative connection may include electrically coupling or connecting at least one output of one device, component, or circuit to at least one input of another device, component, or circuit, for example and without limitation via a bus or other facility for intercommunication between elements of a computing device. A communicative connection may also include indirect connections via, for example and without limitation, wireless connections, radio communications, low-power wide-area networks, optical communications, magnetic coupling, capacitive coupling, or optical coupling. In some cases, the term “communicatively coupled” may be used in this disclosure in place of “communicatively connected.”
[0011] Continuing to refer to FIG. 1, the memory 104 may include a primary memory and a secondary memory. The "primary memory", also known as "random access memory" (RAM) for the purposes of the present disclosure, is a short-term storage device where information is processed. In one or more embodiments, during the use of the computing device, instructions and / or information may be sent to the primary memory where the information can be processed. In one or more embodiments, information may be input into the primary memory only while a particular software is running. In one or more embodiments, after the computing device is turned off and / or after the use of the software is terminated, the information in the primary memory is erased and / or removed. In one or more embodiments, the primary memory may be referred to as "volatile memory", and volatile memory holds information only while the data is being used and / or processed. In one or more embodiments, volatile memory may lose information after a power loss. The "secondary memory", also known as "storage", "hard disk drive", etc. for the purposes of the present disclosure, is a long-term storage device where the operating system and other information are stored. In one or more embodiments, information may be retrieved from the secondary memory and sent to the primary memory during use. In one or more embodiments, the secondary memory may be referred to as non-volatile memory, and information is stored even during a power loss. In one or more embodiments, the data in the secondary memory cannot be accessed by the processor 102. In one or more embodiments, the data is transferred from the secondary memory to the primary memory, and the processor 102 can access the information from the primary memory.
[0012] Referring further to Figure 1, the device 100 may include a database. The database may include a remote database. The database may be implemented as, but is not limited to, a relational database, a key-value lookup database such as a NoSQL database, or any other format or structure for use as a database that a person skilled in the art would recognize as appropriate when considering the entirety of this disclosure. The database may also be implemented using a distributed data storage protocol and / or data structure such as a distributed hash table. The database may include multiple data entries and / or records as described above. Data entries in the database may be flagged with or linked to one or more additional elements of information, and those additional elements may be reflected in linked tables such as tables associated by data entry cells and / or indexes in a relational database. A person skilled in the art will recognize, when considering the entirety of this disclosure, a variety of ways in which data entries in a database can store, retrieve, organize, and / or reflect data and / or records.
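As a non-limiting illustration of a relational implementation in which data entries are linked across tables, the following sketch uses an in-memory SQLite database. The table names, column names, and helper functions are hypothetical and chosen only for illustration. The disclosure does not specify a schema.

```python
import sqlite3


def build_example_database() -> sqlite3.Connection:
    """Create an in-memory relational database with two linked tables.

    Table and column names are hypothetical, for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE slides ("
        " slide_id TEXT PRIMARY KEY,"
        " scanned_on TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE verifications ("
        " id INTEGER PRIMARY KEY,"
        " slide_id TEXT NOT NULL REFERENCES slides(slide_id),"
        " passed INTEGER NOT NULL)"
    )
    return conn


def record_verification(conn: sqlite3.Connection, slide_id: str,
                        scanned_on: str, passed: bool) -> None:
    """Store a slide entry (if new) and a verification record linked to it."""
    conn.execute("INSERT OR IGNORE INTO slides VALUES (?, ?)",
                 (slide_id, scanned_on))
    conn.execute("INSERT INTO verifications (slide_id, passed) VALUES (?, ?)",
                 (slide_id, int(passed)))
    conn.commit()


def verifications_for(conn: sqlite3.Connection, slide_id: str) -> list:
    """Retrieve linked records by joining the two tables on slide_id."""
    return conn.execute(
        "SELECT s.slide_id, s.scanned_on, v.passed"
        " FROM slides AS s JOIN verifications AS v ON v.slide_id = s.slide_id"
        " WHERE s.slide_id = ? ORDER BY v.id",
        (slide_id,),
    ).fetchall()


conn = build_example_database()
record_verification(conn, "SLD-20241130-001", "2024-11-30", True)
rows = verifications_for(conn, "SLD-20241130-001")
print(rows)
```

A key-value store or distributed hash table, also contemplated above, would replace the join with a lookup keyed on the slide identifier.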
[0013] Continuing to refer to Figure 1, the apparatus 100 may include, and/or be communicatively connected to, a server such as, without limitation, a remote server, a cloud server, or a network server. In one or more embodiments, the computing device may be configured to transmit one or more processes to the server for execution. In one or more embodiments, the server may have additional and/or greater processing power and may execute one or more of the processes described below. For example, and without limitation, one or more processes associated with machine learning may be executed by a network server: data is transmitted to the server, processed there, and sent back to the computing device. In one or more embodiments, offloading processes to the server in this way increases the available computing power and/or decreases the power consumed by the computing device of the apparatus. In one or more embodiments, the computing device may transmit a process to the server so that the computing device conserves power or energy.
[0014] Referring further to Figure 1, the device 100 may include any “computing device” as described in this disclosure, including, without limitation, a microcontroller, microprocessor, digital signal processor (DSP), and/or system on a chip (SoC). The device 100 may include, be included in, and/or communicate with a mobile device such as a mobile phone or smartphone. The device 100 may include a single computing device operating independently, or two or more computing devices operating in concert, in parallel, sequentially, or the like; two or more computing devices may be included together in a single computing device or in two or more computing devices. The device 100 may interface or communicate with one or more additional devices via a network interface device, as described in more detail below. The network interface device may be used to connect the processor 102 to one or more of a variety of networks and to one or more devices. Examples of network interface devices include, but are not limited to, network interface cards (e.g., mobile network interface cards, LAN cards), modems, and any combination thereof. Examples of networks include, but are not limited to, wide area networks (e.g., the Internet, corporate networks), local area networks (e.g., networks associated with offices, buildings, campuses, or other relatively small geographical spaces), telephone networks, data networks associated with telephone/voice providers (e.g., mobile communications provider data and/or voice networks), direct connections between two computing devices, and any combination thereof. Networks may use wired and/or wireless modes of communication. In general, any network topology may be used. Information (e.g., data, software, etc.) can be communicated to and from a computer and/or computing device. The processor 102 may include, but is not limited to, a first computing device or cluster of computing devices at a first location and a second computing device or cluster of computing devices at a second location. The device 100 may include one or more computing devices dedicated to data storage, security, traffic distribution for load balancing, and the like. The device 100 may distribute one or more computing tasks, as described below, across multiple computing devices, which may operate in parallel, in series, redundantly, or in any other manner used for distributing tasks or memory between computing devices. In a non-limiting example, the device 100 may be implemented using a “shared nothing” architecture.
[0015] Continuing to refer to Figure 1, the processor 102 may be designed and/or configured to execute any method, process, or sequence of process steps in any embodiment described herein, in any order and with any degree of repetition. For example, the processor 102 may be configured to execute a single step or sequence repeatedly until a desired or commanded outcome is achieved. Repetition of a step or sequence of steps may be performed iteratively and/or recursively, using the outputs of previous repetitions as inputs to subsequent repetitions, aggregating the inputs and/or outputs of repetitions to produce an aggregate result, reducing or decrementing one or more variables such as global variables, and/or dividing a larger processing task into a set of iteratively addressed smaller processing tasks. The processor 102 may execute any step or sequence of steps described herein in parallel, such as simultaneously and/or substantially simultaneously executing a step two or more times using two or more parallel threads, processor cores, or the like; division of tasks between parallel threads and/or processes may be performed according to any protocol suitable for dividing tasks between iterations. Those skilled in the art, upon reviewing the entirety of this disclosure, will recognize a variety of ways in which steps, sequences of steps, processing tasks, and/or data can be subdivided, shared, or otherwise processed using iterative, recursive, and/or parallel processing.
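As a non-limiting illustration of dividing a larger processing task into smaller tasks, executing them in parallel, and aggregating the outputs, the following sketch uses a thread pool from the Python standard library. The per-chunk work (a simple sum) is a placeholder assumption standing in for any per-core or per-slide computation. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items: list, n_chunks: int) -> list:
    """Divide one larger task into n_chunks smaller tasks (round-robin)."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    return [items[i::n_chunks] for i in range(n_chunks)]


def process_chunk(chunk: list) -> float:
    """Placeholder per-task work; a real system would analyze image data."""
    return sum(chunk)


def run_in_parallel(values: list, n_workers: int = 2) -> float:
    """Process chunks on parallel threads and aggregate the partial outputs."""
    chunks = split_into_chunks(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    return sum(partial_results)


total = run_in_parallel(list(range(10)), n_workers=3)
print(total)
```

Threads are used here only for brevity. Processes or separate computing devices, as described above, would follow the same divide, execute, and aggregate pattern.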
[0016] Referring further to Figure 1, at least one optical device 106 comprises at least one camera 108 configured to scan at least one slide 110, and at least one slide 110 comprises two or more stained cores 112. As used in this disclosure, an “optical device” is a device designed to manipulate, control, or utilize light. Light may include, but is not limited to, visible light, ultraviolet light, infrared light, and the like. In a non-limiting example, the optical device 106 can be used to visualize, capture, or quantify features of tissue samples 116 arranged in an array format. The optical device 106 may be essential for processes such as imaging, analysis, or spectroscopic evaluation of tissue. In another non-limiting example, the optical device 106 may include a fluorescence microscope, which can be used to visualize labeled biomarkers within the tissue sample 116. For example, the fluorescence microscope may be equipped with filters tuned to detect specific fluorophores and can highlight the expression of proteins stained with fluorescent antibodies. In yet another non-limiting example, a digital slide scanner may function as the optical device 106. The digital slide scanner may include a high-resolution optical system for capturing detailed images of the entire tissue microarray slide, enabling downstream digital analysis such as quantification of stained areas using software tools. As used in this disclosure, a “tissue microarray” is an experimental tool consisting of a paraffin block or other substrate containing multiple small tissue samples 116 arranged in a grid pattern, the tissue samples being sectioned and mounted on a microscope slide for high-throughput analysis. The tissue samples 116 in the tissue microarray may be derived from different patients, experimental conditions, or anatomical sites, enabling simultaneous testing of multiple specimens under standardized conditions.
Tissue microarrays can be used for applications such as biomarker discovery, drug testing, or comparative pathology studies. In another non-limiting example, the optical device 106 may include a spectrophotometer integrated with a tissue microarray analysis system for measuring the absorbance or reflectance properties of stained tissue spots, enabling quantitative assessment of biomarker concentrations. As used in this disclosure, a “slide” is a flat piece of material designed to hold a specimen or sample for examination, imaging, or analysis under an optical device 106 or related instrument. In non-limiting examples, slide 110 may be made from glass, plastic, or the like. In non-limiting examples, slide 110 may be rectangular in shape. Without limitation, slide 110 may be used to present tissue sections, cells, or other samples for microscopic examination or related research. In a non-limiting example, slide 110 may be a standard glass microscope slide approximately 25 × 75 mm in size, used to hold tissue sections that are attached with adhesive and covered with a coverslip for histological analysis. In another non-limiting example, slide 110 may be a positively charged glass slide designed to enhance the adhesion of tissue sections or cells, which is particularly useful in immunohistochemistry (IHC) or in situ hybridization (ISH) applications. In another non-limiting example, slide 110 may refer to a tissue microarray slide in which multiple small tissue cores are embedded in a paraffin block, sectioned, and mounted on slide 110. This format enables high-throughput analysis of multiple samples on a single slide. In another non-limiting example, slide 110 may be a specialized digital pathology slide designed for use with a whole-slide imaging scanner, enabling the digitization of sample images for computational or remote analysis.
As used in this disclosure, a “stained core” is a tissue core treated with a specific stain or dye to highlight specific cellular, molecular, or structural features. As used in this disclosure, a “tissue core” is a sample or segment of biological tissue extracted for analysis, testing, or use. Stained cores may be derived from a tissue microarray (TMA) and processed to visualize specific components, such as proteins, nucleic acids, or cellular structures, under an optical device 106 for diagnostic or research purposes. In a non-limiting example, the stained core may be a tissue core treated with hematoxylin and eosin (H&E) staining. In such a case, hematoxylin stains the cell nucleus blue, while eosin stains the cytoplasm and extracellular components pink, providing an overview of the tissue morphology. In another non-limiting example, the stained core may include a tissue core subjected to immunohistochemical (IHC) staining, where a chromogenic substrate such as diaminobenzidine (DAB) produces a brown color to indicate the presence of the target protein. In yet another non-limiting example, the stained core may be fluorescently labeled, where the tissue core is treated with a fluorescent dye or an antibody conjugated with a fluorophore, allowing visualization of specific molecular markers using a fluorescence microscope. In other non-limiting examples, stained cores may include cores treated with special stains such as Masson's trichrome for collagen or periodic acid-Schiff (PAS) for carbohydrates, which are often used to identify specific histological components in tissue sample 116.
[0017] Continuing to refer to Figure 1, in a non-limiting example, the optical device 106 may correspond to one or more embodiments of the apparatus described in U.S. Patent Application No. 18/382,345, entitled “SYSTEM AND METHODS FOR DETECTING AND CLEANING CONTAMINANTS FROM AN IMAGING OPTICAL PATH,” filed October 20, 2023, Attorney Docket No. 1519-105USU1, which is incorporated herein by reference in its entirety.
[0018] Continuing to refer to Figure 1, at least one slide 110 may comprise an immunohistochemistry slide 114, and two or more stained cores 112 of the immunohistochemistry slide 114 may comprise a tissue sample 116. As used in this disclosure, an “immunohistochemistry slide” is a prepared microscope slide used in immunohistochemistry (IHC) to analyze the presence and localization of a particular antigen in biological tissue. As used in this disclosure, a “tissue sample” is a portion of biological tissue obtained from an organism. The tissue sample 116 may be derived from any organ, structure, or system within an organism and may comprise cellular components, extracellular components, and structural components. The tissue sample 116 may be stored, processed, or otherwise prepared to facilitate examination under specific experimental or diagnostic conditions. The immunohistochemistry slide may comprise a tissue sample 116 affixed to its surface, which may be treated with an antibody that binds to a target antigen. The bound antibody may then be visualized by chromogenic labeling or fluorescence labeling, allowing for detailed observation of molecular expression patterns and tissue structures under a microscope. In a non-limiting example, an immunohistochemistry slide 114 may be prepared using a sample of human breast tissue to detect the presence of estrogen receptors. The tissue sample 116 may be immobilized on the immunohistochemistry slide 114, treated with a primary antibody specific to estrogen receptors, and then incubated with a secondary antibody conjugated to a chromogenic enzyme. The resulting color change may indicate the presence and distribution of estrogen receptors in the tissue. In another non-limiting example, an immunohistochemistry slide 114 may be used in nerve tissue studies to identify amyloid-beta plaques associated with Alzheimer's disease.
The immunohistochemistry slide 114 may be treated by applying an antibody specific to amyloid-beta, followed by a secondary antibody labeled with a fluorescent dye. Under a fluorescence microscope, the immunohistochemistry slide 114 can reveal the location and intensity of amyloid-beta deposits. In a further non-limiting example, the immunohistochemistry slide 114 can be used to investigate the expression of cancer biomarkers such as HER2/neu in a sample of gastric tissue. After the tissue is attached to the immunohistochemistry slide 114, it can undergo a series of steps including antigen retrieval, antibody incubation, and chromogenic visualization. Without limitation, the results on the immunohistochemistry slide 114 can help pathologists diagnose the presence of HER2-positive cancer.
[0019] Continuing to refer to Figure 1, the staining process for immunohistochemistry slide 114 may include tissue fixation, sectioning, deparaffinization, antigen retrieval, blocking, antibody application, visualization using a chromogenic or fluorescent substrate, and counterstaining before microscopy. For example, the process may begin with fixing the tissue sample, for instance in a formalin solution, to preserve its structure and prevent degradation. The tissue may then be embedded in paraffin to create a stable block for sectioning, and the resulting tissue sections may be placed on a microscope slide. The paraffin may be removed by a deparaffinization process, which may include treatment with xylene and alcohol. The tissue sections may then undergo antigen retrieval to expose the epitopes and allow antibodies to bind effectively. For example, this step may include heat-induced epitope retrieval (HIER) or enzymatic digestion. Next, a blocking solution can be applied to the tissue to minimize nonspecific binding of the antibody. Following this, a primary antibody designed to specifically bind to the target antigen can be applied to the immunohistochemistry slide 114. After incubation, the immunohistochemistry slide 114 can be washed to remove unbound antibodies. Optionally, a secondary antibody conjugated with a detection enzyme or fluorophore can then be added to bind to the primary antibody. Finally, a chromogenic or fluorescent substrate can be introduced to visualize the antigen-antibody complex and generate a colored or fluorescent signal at the antigen site. The immunohistochemistry slide 114 can be counterstained to highlight general tissue morphology and covered with a coverslip, after which it is typically ready for microscopic examination. Each of these steps can be optimized depending on the tissue type, target antigen, and detection system used.
[0020] Referring further to Figure 1, at least one processor 102 is configured to receive at least one digital slide 118 corresponding to at least one slide 110 from at least one optical device 106. As used in this disclosure, “digital slide” is a digital representation of a physical slide created by scanning or imaging the physical slide using an optical device or imaging device. The digital slide 118 may include visual data corresponding to a tissue sample 116, cells, or other specimen present on the physical slide, enabling analysis, visualization, and sharing in a digital format. In a non-limiting example, the digital slide 118 may represent a digital capture image or dataset corresponding to a physical slide containing a tissue sample 116. The processor 102 is then programmed to process, analyze, or store the digital slide 118, enabling manipulation of the visual data for diagnostic, research, or educational purposes. The optical device 106 may perform an initial capture of slide features, which the processor 102 can then interpret or use in further computational tasks.
[0021] Referring further to Figure 1, at least one processor 102 is configured to use metadata 120 to determine slide identifier 122. As used in this disclosure, “metadata” is descriptive or structural information that provides context, attributes, or characteristics about a primary dataset, object, or resource. Metadata 120 may include details such as creation date, creator, file type, format, location, or specific parameters related to the primary data. Without limitation, metadata 120 may help facilitate the organization, discovery, management, or interpretation of related data. As used in this disclosure, a “slide identifier” is a unique identifier associated with a particular slide. In a non-limiting example, slide identifier 122 may include a unique identifier for a microscope slide or digital slide 118, which facilitates its recognition, tracking, and association with related data or metadata 120. Without limitation, slide identifier 122 may include an alphanumeric code, barcode, QR code®, or electronic identifier, and may include details such as sample type, preparation date, patient or specimen information, and laboratory annotations. In a non-limiting example, the processor 102 may receive metadata 120 from the optical device 106, which may include information such as the scanning date, magnification level, and a barcode embedded in the physical microscope slide. Using the metadata 120, the processor 102 may then determine a slide identifier 122, such as "SLD-20241130-001," where the prefix "SLD" indicates slide 110, the date corresponds to when slide 110 was scanned, and the numerical suffix provides a unique identifier for that particular slide. The determined slide identifier 122 is then linked to its digital slide 118 and stored in a laboratory information management system (LIMS) for easy retrieval, ensuring traceability and accurate association with corresponding patient or experimental data.
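As a non-limiting illustration, the identifier format in the example above (prefix, scan date, numerical suffix) might be built and parsed as sketched below. The function names, the three-digit suffix width, and the validation rules are assumptions made for illustration. The disclosure gives only the single example "SLD-20241130-001".

```python
import re
from datetime import date, datetime

# Assumed layout: uppercase prefix, 8-digit date (YYYYMMDD), 3-digit suffix.
_ID_PATTERN = re.compile(r"^([A-Z]+)-(\d{8})-(\d{3})$")


def make_slide_identifier(scan_date: date, sequence: int,
                          prefix: str = "SLD") -> str:
    """Build an identifier such as SLD-20241130-001 from scan metadata."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must fit in three digits (1 to 999)")
    return f"{prefix}-{scan_date:%Y%m%d}-{sequence:03d}"


def parse_slide_identifier(identifier: str) -> tuple:
    """Split an identifier back into (prefix, scan date, sequence number)."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized slide identifier: {identifier!r}")
    prefix, raw_date, raw_seq = match.groups()
    return prefix, datetime.strptime(raw_date, "%Y%m%d").date(), int(raw_seq)


slide_id = make_slide_identifier(date(2024, 11, 30), 1)
print(slide_id)
print(parse_slide_identifier(slide_id))
```

Parsing through `datetime.strptime` rejects impossible dates, which gives a simple consistency check before the identifier is stored alongside the digital slide.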
[0022] Continuing to refer to Figure 1, in a non-limiting example, the metadata 120 used to derive slide identifier 122 may correspond to one or more embodiments of the apparatus described in U.S. Patent Application No. 18/774,574, filed July 16, 2024, Attorney Docket No. 1519-161USU1, which is incorporated herein by reference in its entirety.
[0023] Continuing to refer to Figure 1, determining the slide identifier 122 may involve using an image processing algorithm to extract metadata 120 from the slide label 124. As used in this disclosure, “slide label” is a physical or digital marker associated with a microscope slide that provides identifying information about the content, origin, or purpose of the slide 110. Without limitation, the slide label 124 may include text, alphanumeric codes, barcodes, or visual elements such as colors or symbols, and can convey information such as sample type, patient ID, preparation date, or experimental details. The slide label 124 may be affixed directly to the slide 110 or digitally associated with a corresponding digital slide 118 in a database or software system. As used in this disclosure, “image processing algorithm” is a set of computational procedures or rules designed to analyze, manipulate, or enhance a digital image to extract useful information or improve image quality. In non-limiting examples, an image processing algorithm may perform operations such as filtering, segmentation, feature detection, pattern recognition, or transformation to enable tasks such as object identification, noise reduction, or data extraction from visual input. Without limitation, the algorithm may be implemented in software, hardware, or a combination of both, and may be applied to various types of image data, including still images or video frames.
[0024] Continuing to refer to Figure 1, in a non-limiting example, determining the slide identifier 122 may include using an image processing algorithm to analyze a scanned image of the slide 110 and extract metadata 120 from the slide label 124 present on the slide 110. For example, and without limitation, the image processing algorithm could identify and decode a QR code® printed on the slide label 124 and extract metadata 120 such as a unique slide number, patient ID, and preparation date. The metadata 120 can then be used by the processor 102 to determine the slide identifier 122, which can be formatted as "QR20241130-045" to uniquely represent the slide 110 in the database. In another non-limiting example, the slide label 124 may contain human-readable text, such as a sample ID or experiment code, that can be captured within the image of the digital slide 118. The image processing algorithm could then perform optical character recognition (OCR) on the label to extract the metadata 120. For example, and without limitation, the algorithm could identify the text "Sample_ABC123" on a label and use it to determine slide identifier 122 "ABC123-01", which links slide 110 to its associated metadata 120 in a database for traceability. Continuing the above non-limiting example, the image processing algorithm could also enhance the scanned image to remove distortion from the slide label 124 or improve contrast before extracting the metadata 120. For example, and without limitation, if the slide label 124 is partially obscured by glare or smudges, the image processing algorithm could apply filtering or edge detection techniques to clarify the text or code and ensure accurate extraction of the metadata 120 used to determine slide identifier 122.
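As a non-limiting illustration, the step that follows OCR in the "Sample_ABC123" example may be sketched as follows. The OCR step itself is not shown; the sketch assumes the recognized text is already available as a string. The pattern, the function name identifier_from_label_text, and the two-digit section suffix are assumptions made for illustration only.

```python
import re

# Hypothetical label convention: the literal "Sample_" followed by an alphanumeric code.
LABEL_PATTERN = re.compile(r"Sample_([A-Z0-9]+)")


def identifier_from_label_text(ocr_text, section=1):
    """Derive an identifier such as 'ABC123-01' from OCR output, or None if no code is found."""
    match = LABEL_PATTERN.search(ocr_text)
    if match is None:
        return None  # Label unreadable: the caller may fall back to another metadata source.
    return f"{match.group(1)}-{section:02d}"


print(identifier_from_label_text("  Sample_ABC123\n"))
```

Returning None rather than raising keeps the fallback decision with the caller, for example a lookup in a database when the label cannot be read.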
[0025] Continuing to refer to Figure 1, determining the slide identifier 122 using metadata 120 may include extracting the slide identifier 122 from the electronic health record database 126. As used in this disclosure, “electronic health record database” is a digital repository designed to store, manage, and facilitate access to electronic health records (EHRs). The electronic health record database 126 may include, but is not limited to, comprehensive and structured information about a patient’s medical history, including demographics, diagnoses, treatment plans, test results, prescriptions, and other clinical data. Furthermore, the electronic health record database 126 may be implemented to support healthcare providers in clinical decision-making, ensure data interoperability, and maintain the secure and efficient storage and retrieval of patient health information. In a non-limiting example, determining the slide identifier 122 using metadata 120 may include querying the electronic health record database 126 to match the metadata 120 extracted from the slide label 124 with corresponding patient or sample information stored in the database. For example, and without limitation, the metadata 120 extracted from slide label 124 may include patient ID "P12345" and biopsy date "2024-11-30". Processor 102 can use this metadata 120 to search the electronic health record database 126 and identify the entry for "Patient ID: P12345" with the corresponding biopsy procedure performed on the same date. The electronic health record database 126 may include a pre-assigned slide identifier 122, such as "BX123-20241130", which processor 102 retrieves and associates with slide 110. In another non-limiting example, the metadata 120 may include a specimen accession number, such as "A67890", linked to a histopathology report in the electronic health record database 126. The processor 102 can query the electronic health record database 126 using the accession number and retrieve the associated slide identifier 122 "HIST-A67890-01" to ensure that slide 110 and its digital representation accurately match the patient and procedure details in the database.
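As a non-limiting illustration, the lookup of a pre-assigned slide identifier by patient ID and biopsy date may be sketched with an in-memory SQLite database standing in for the electronic health record database 126. The table name, column names, and function name are assumptions made for illustration only; a production record system would expose its own schema and access controls.

```python
import sqlite3

# In-memory stand-in for the record database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE slides (patient_id TEXT, biopsy_date TEXT, slide_identifier TEXT)"
)
conn.execute("INSERT INTO slides VALUES ('P12345', '2024-11-30', 'BX123-20241130')")


def lookup_slide_identifier(connection, patient_id, biopsy_date):
    """Return the pre-assigned identifier matching the label metadata, or None."""
    row = connection.execute(
        "SELECT slide_identifier FROM slides WHERE patient_id = ? AND biopsy_date = ?",
        (patient_id, biopsy_date),  # Parameter binding avoids SQL injection.
    ).fetchone()
    return row[0] if row else None


print(lookup_slide_identifier(conn, "P12345", "2024-11-30"))
```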
[0026] Referring further to Figure 1, at least one processor 102 is configured to locate at least one digital slide 118 using at least one object detection technique 128, wherein locating the at least one digital slide 118 includes identifying a target stained core 130 and aligning the target stained core 130 to a standard format 132. As used in this disclosure, “object detection technique” is a computational method used to identify and locate specific objects in an image or video. The object detection technique 128 may classify objects and determine their location by analyzing pixel patterns, shapes, textures, or other visual features, for example by generating bounding boxes or segmentation masks. The object detection technique 128 may use conventional computer vision methods such as edge detection or template matching, or algorithms based on advanced machine learning models, including convolutional neural networks (CNNs) or transformer architectures. As used in this disclosure, “target stained core” is a specific area within a tissue sample 116 on a slide 110 that has been treated with a stain to highlight specific cellular or molecular features for analysis. The target stained core 130 may correspond to areas of biological importance, such as tumor regions, clusters of immune cells, or specific structures within the tissue. Staining may include, but is not limited to, colorimetric techniques, fluorescence techniques, or other contrast-enhancing techniques to visually distinguish the core and facilitate targeted examination or computational analysis of the highlighted features. As used in this disclosure, “standard format” is a predetermined structured format used to represent, record, or communicate information in a consistent and uniform manner. Without limitation, standard format 132 may include specified fields, layouts, or data entry rules to ensure clarity, comparability, and interoperability across different systems, users, or applications. Standard format 132 may be implemented physically, such as on paper, or digitally, such as in an electronic template or database schema. In a non-limiting example, at least one processor 102 can analyze at least one digital slide 118 using an object detection technique 128, such as a convolutional neural network, to locate a target stained core 130. The digital slide 118 may contain multiple tissue cores, and the object detection technique 128 can identify the cores by a specific staining intensity or pattern indicating a biological marker of interest, such as HER2 in breast cancer tissue. Once the target stained core 130 is identified, the processor 102 can align the core into a standard format 132, such as a rectangular or circular region, to ensure consistent orientation and scaling for subsequent analysis or reporting. In another non-limiting example, the processor 102 can localize the target stained core 130 on a digital slide 118 containing immunohistochemical data. Without limitation, using an object detection technique 128 based on a segmentation algorithm, the processor 102 can delineate the boundaries of the stained cores highlighting the tumor microenvironment. After detecting the cores, the processor 102 can rotate, crop, and resize the target stained cores 130 for integration into a machine learning pipeline or comparison across datasets, and align them within a standard format 132 template, such as a standardized aspect ratio or coordinate grid. In another non-limiting example, the processor 102 can also localize the target stained cores 130 within a tissue microarray digital slide 118. Using object detection techniques 128, the processor 102 can identify stained cores based on a color intensity threshold from a chromogenic stain such as hematoxylin-eosin (H&E). The processor 102 can then center and align the target stained cores 130 within a standard format 132 having predetermined dimensions, facilitating inclusion in an automated scoring system or diagnostic workflow.
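As a non-limiting illustration, the threshold-based localization and centering described above may be sketched on a small grayscale grid using only the Python standard library. Connected-component labeling is used here as a simple stand-in for the object detection technique 128; it is not asserted to be the technique of any embodiment. The function names, the toy image, the intensity threshold, and the square canvas size are assumptions made for illustration only.

```python
from collections import deque


def find_cores(image, threshold):
    """Return (top, left, bottom, right) boxes of 4-connected regions with values >= threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or seen[r][c]:
                continue
            # Breadth-first flood fill from this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def to_standard_format(image, box, size):
    """Crop the box and center it on a size x size canvas filled with zeros."""
    top, left, bottom, right = box
    crop = [row[left:right + 1] for row in image[top:bottom + 1]]
    height, width = len(crop), len(crop[0])
    if height > size or width > size:
        raise ValueError("core is larger than the standard canvas")
    off_y, off_x = (size - height) // 2, (size - width) // 2
    canvas = [[0] * size for _ in range(size)]
    for y in range(height):
        for x in range(width):
            canvas[off_y + y][off_x + x] = crop[y][x]
    return canvas


# Toy stain-intensity image: one 2x2 core and one single-pixel core.
slide = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 7, 9, 0, 0, 0],
    [0, 0, 0, 0, 6, 0],
    [0, 0, 0, 0, 0, 0],
]
boxes = find_cores(slide, threshold=5)
tiles = [to_standard_format(slide, box, size=4) for box in boxes]
print(boxes)
```

A learned detector could replace find_cores without changing to_standard_format, since the two steps communicate only through bounding boxes.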
[0027] Referring further to Figure 1, in some embodiments, the apparatus 100 may include a machine vision system including at least one camera. The machine vision system can use images from at least one camera to make decisions about the scene, space, and/or objects. For example, the machine vision system may be used for world modeling or alignment of objects in space. In some embodiments, alignment may include, but is not limited to, image processing such as object recognition, feature detection, and edge/corner detection. Non-limiting examples of feature detection may include scale-invariant feature transform (SIFT), Canny edge detection, and Shi-Tomasi corner detection. In some embodiments, alignment may include one or more transformations for orienting the camera frame (or image or video stream) to a three-dimensional coordinate system; exemplary transformations include, but are not limited to, homography and affine transformations. In one embodiment, the alignment of the first frame to the coordinate system can be verified and/or corrected using object recognition and/or computer vision, as described above. Initial alignment in two dimensions, for example represented as alignment to x and y coordinates, may be performed using a two-dimensional projection of a three-dimensional point onto a first frame. A third dimension of alignment, representing depth and/or the z-axis, can be detected by comparing two frames. For example, if the first frame includes a pair of frames captured using a pair of cameras (e.g., a stereoscopic camera, also referred to in this disclosure as a stereo camera), image recognition and/or edge detection software can be used to detect a pair of stereoscopic images of an object, compare the two stereoscopic images to derive z-axis values for points on the object, and, for example, allow the derivation of further z-axis points inside and/or around the object using interpolation. This can be repeated with multiple objects in the field of view, including, but not limited to, environmental features of objects identified by an object classifier and/or indicated by operators. In one embodiment, the x and y axes can be selected to span a common plane and/or the xy-plane of the first frame for two cameras used for stereoscopic image capture; as a result, the x and y translation components and φ can be pre-populated in the translation matrix and rotation matrix for the affine transformation of the object's coordinates, as also described above. Initial estimates of the x and y coordinates and/or of the transformation matrices may alternatively or additionally be made between the first frame and a second frame, as described above. As described above, for multiple points on the object and/or edges of the object and/or points on multiple edges, the x and y coordinates of the first stereoscopic frame may be input with an initial estimate of the z coordinate based on assumptions about the object, such as the assumption that the ground is substantially parallel to the xy-plane, as selected above. Next, the z coordinates and/or the x, y, and z coordinates aligned using the image capture and/or object recognition process described above can be compared with the coordinates predicted using the initial estimates in the transformation matrices; an error function can be calculated by comparing the two sets of points, and new x, y, and/or z coordinates can be iteratively estimated and compared until the error function falls below a threshold level. In some cases, the machine vision system may use a classifier such as any classifier described throughout this disclosure.
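As a non-limiting illustration, the derivation of a z-axis value from a pair of stereoscopic images may be sketched with the standard pinhole relation z = f · B / d, where f is the focal length in pixels, B is the distance between the two cameras, and d is the horizontal disparity of the same point in the two images. The sketch assumes a rectified, parallel camera pair, which is an assumption not stated in this disclosure; the function name and the numeric values are likewise hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_mm, x_left_px, x_right_px):
    """Depth (in mm) of a point seen at x_left_px and x_right_px in a rectified stereo pair."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        # Zero or negative disparity means the match is invalid or the point is at infinity.
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_mm / disparity


# Hypothetical values: 1000 px focal length, 50 mm baseline, 20 px disparity.
print(depth_from_disparity(1000.0, 50.0, 420.0, 400.0))
```

Depth values obtained this way for matched points could then serve as the samples from which further z-axis points are interpolated.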
[0028] Continuing to refer to Figure 1, in a non-limiting example, processor 102 may be configured to determine characteristics of slide 110, such as color, hue, or texture, using machine vision technology or machine learning algorithms. The apparatus 100 may include image capture components, such as a camera or sensor, that can collect visual data from the slide 110. The processor 102 can analyze this data using a trained machine learning model that can classify or quantify attributes such as specific color values, surface patterns, or texture characteristics. This capability can be used to enhance the ability of the apparatus 100 to evaluate, classify, or display slide features for various applications such as inventory management, quality control, or user personalization.
[0029] Referring further to Figure 1, at least one processor 102 is configured to evaluate the target stained core 130 using a predetermined threshold 134, wherein the evaluation includes comparing the target stained core 130 with a control core 136, generating a validation output 138 as a function of the comparison between the target stained core 130 and the control core 136, and transmitting the validation output 138 to a downstream device 140. As used in this disclosure, “predetermined threshold” is a pre-set, specified value or criterion that serves as a reference point for decision-making, comparison, or triggering action. As used in this disclosure, “control core” is a specific tissue sample 116 contained in a tissue microarray used as a reference or standard to ensure the reliability, consistency, and accuracy of the analytical process. The control core 136 may represent known characteristics such as specific staining patterns, antigen expression levels, or cellular composition, and can be used to validate experimental conditions, calibrate instruments, or compare results with a sample of interest. The control core 136 may include, but is not limited to, positive controls, negative controls, or reference tissues selected for their consistent and well-characterized properties. As used in this disclosure, “positive control” is a sample or experimental condition included in an analysis that is expected to yield a known measurable response or result. A positive control may help to verify that a system, process, or assay is functioning correctly and capable of detecting the phenomenon of interest. For example, a positive control tissue may express a target antigen at a known level. As used in this disclosure, “negative control” is a sample or experimental condition included in an analysis that is not expected to yield a response or result. A negative control may help to identify false positives, ensure specificity, and verify that observed results are due to the intended experimental conditions. For example, a negative control tissue may lack a target antigen. As used in this disclosure, “reference tissue” is a well-characterized tissue sample used as a benchmark or standard in an experiment or analysis. A reference tissue can provide a consistent baseline for comparison and ensure uniformity and reproducibility across multiple tests or samples. For example, and without limitation, the reference tissue may be a standard sample from a healthy organ used for comparison with diseased tissue. As used in this disclosure, “validation output” is a result or dataset generated by a system, process, or algorithm that helps verify the accuracy, reliability, or consistency of a procedure, model, or analysis. The validation output 138 can be compared to a given standard, expected result, or ground truth data to evaluate whether the system or process functions as intended. The validation output 138 may include quantitative metrics, qualitative assessments, or visual representations and can be used to identify errors, optimize performance, or ensure compliance with specified standards. As used in this disclosure, “downstream device” is a device that accesses and interacts with the apparatus 100. For example, and without limitation, the downstream device 140 may include a remote device and/or the apparatus 100. In non-limiting embodiments, the downstream device 140 may correspond to a computing device as described throughout this disclosure.
[0030] Continuing to refer to Figure 1, in a non-limiting example, at least one processor 102 can evaluate a target stained core 130 from a digital slide 118 by comparing its staining intensity to a control core 136 using a predetermined threshold 134. For example, in a breast cancer diagnostic workflow, the target stained core 130 might be a tumor biopsy stained for HER2 expression, while the control core 136 might be a tissue sample 116 (positive control) with a known level of HER2 expression. The processor 102 can use an image processing algorithm to measure the stain intensity in both cores and determine whether the target stained core 130 meets a predetermined threshold 134 for HER2 positivity. Based on this comparison, the processor 102 can generate a validation output 138 such as "HER2 positive" or "HER2 negative" and send the validation output 138 to a downstream device 140, such as an electronic health record system or a reporting tool for review by a pathologist. In another non-limiting example, processor 102 can evaluate a target stained core 130 in a tissue microarray study of immune responses, where a predetermined threshold 134 corresponds to the percentage of positively stained immune cells. A control core 136 may be a negative control tissue known to lack the target antigen. Processor 102 can compare the target stained core 130 to the control core 136 to ensure that the staining is specific and not due to background noise or nonspecific binding. The validation output 138 may include a quantified metric such as "Positivity Rate: 75%" and can be sent to a downstream device 140, such as a data visualization platform, for further analysis.
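As a non-limiting illustration, the comparison of a target core with a positive control core may be sketched as a ratio of mean stain intensities tested against a threshold. The ratio-based criterion, the function name evaluate_core, the pixel values, and the threshold of 0.5 are assumptions made for illustration only and carry no clinical meaning.

```python
from statistics import mean


def evaluate_core(target_pixels, control_pixels, threshold_ratio):
    """Compare mean stain intensity of a target core with a positive control core."""
    control_mean = mean(control_pixels)
    if control_mean <= 0:
        raise ValueError("control core shows no staining; comparison is undefined")
    ratio = mean(target_pixels) / control_mean
    label = "positive" if ratio >= threshold_ratio else "negative"
    # The returned dictionary stands in for the validation output sent downstream.
    return {"ratio": ratio, "label": label}


result = evaluate_core([60, 80, 70], [100, 100, 100], threshold_ratio=0.5)
print(result)
```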
[0031] Continuing to refer to Figure 1, generating a validation output 138 may involve utilizing a stain statistical model 142 configured to calculate at least one predetermined value, which is configured to validate at least one digital slide 118. As used in this disclosure, “stain statistical model” is a computational framework designed to analyze and interpret staining patterns, intensities, or distributions within a tissue sample 116 by applying statistical methods. The stain statistical model 142 may incorporate quantitative metrics such as mean intensity, variance, or percentage of positive areas to evaluate staining characteristics and identify patterns or anomalies. The stain statistical model 142 may be used to classify samples, predict outcomes, or assess consistency across staining experiments, and may be trained on historical data or configured with predetermined thresholds 134 to facilitate standardized, objective analysis. As used in this disclosure, “at least one predetermined value” is a minimum quantity, measurement, or parameter that is pre-established to serve as a reference or criterion in a system, process, or analysis. At least one predetermined value can represent a threshold, baseline, or expected level that must be met or exceeded for a particular action, judgment, or evaluation to occur. At least one predetermined value can be determined based on empirical data, theoretical considerations, or operational requirements, and can be applied to various domains such as image intensity, signal intensity, or statistical metrics.
[0032] Continuing with reference to Figure 1, in a non-limiting example, stain statistical model 142 may correspond to one or more embodiments of the apparatus described in U.S. Patent Application No. 18/602,947, entitled "SYSTEMS AND METHODS FOR INLINE QUALITY CONTROL OF SLIDE DIGITALIZATION," filed March 12, 2024, Attorney Docket No. 1519-029USU1, which is incorporated herein by reference in its entirety.
[0033] Continuing with reference to Figure 1, in a non-limiting example, stain statistical model 142 may correspond to one or more embodiments of the apparatus described in U.S. Patent Application No. 18/513,079, entitled “SYSTEM AND METHODS FOR COLOR GAMUT NORMALIZATION FOR PATHOLOGY SLIDES,” filed November 17, 2023, Attorney Docket No. 1519-025USU1, which is incorporated herein by reference in its entirety.
[0034] In a non-limiting example, at least one processor 102 can utilize a stain statistical model 142 to calculate at least one predetermined value for a digital slide 118 stained for HER2 expression. The stain statistical model 142 can analyze the color intensity and distribution within the target stained core 130 by referencing historical data from multiple positive control slides showing a consistent and validated HER2 staining pattern. Using this data, the stain statistical model 142 can calculate thresholds for acceptable staining intensity and uniformity, enabling the processor 102 to determine whether the staining quality meets predetermined quality standards. If the calculated value exceeds the threshold, the processor 102 can validate the digital slide 118 and send it to a pathologist for review. If the value falls below the threshold, the processor 102 can flag the slide 110 for restaining to ensure accuracy and consistency. In another non-limiting example, the stain statistical model 142 may be applied to a digital slide 118 stained for CD8 expression. As used in this disclosure, “CD8 expression” refers to the production and presence of the CD8 glycoprotein on the surface of CD8+ T cells, which are a subset of cytotoxic T lymphocytes. CD8 expression may serve as a biological marker indicating the activation or abundance of these immune cells in a tissue or biological sample. Measurement of CD8 expression using immunohistochemistry (IHC), flow cytometry, or molecular assays may be used to assess immune responses, characterize the tumor microenvironment, or evaluate the effectiveness of immunotherapy interventions. Continuing the preceding non-limiting example, for a digital slide 118 stained for CD8 expression, the at least one predetermined value may represent the minimum acceptable percentage of positively stained immune cells within the target stained core 130. A stain statistical model 142 can be developed by analyzing data from multiple CD8-positive slides to calculate the percentage of positive areas and compare it to a predetermined threshold 134 for quality assurance. If the calculated percentage meets or exceeds the threshold, the processor 102 can validate the slide 110 and send it to a pathologist. Conversely, if the percentage falls below the threshold, slide 110 may be marked as a quality control failure and routed for restaining or additional processing. In another non-limiting example, processor 102 may perform inline automated quality control by using a stain statistical model 142 to calculate metrics such as staining uniformity and background noise of tissue microarray slides. As used in this disclosure, “inline automated quality control” is a process that integrates real-time evaluation of data, products, or results into an operational workflow to ensure that quality standards are met without interrupting or delaying the process. Inline automated quality control may use computational algorithms, statistical models, and/or machine learning techniques to evaluate predetermined criteria such as thresholds, patterns, or consistency metrics. In another non-limiting example, inline automated quality control may include analyzing staining quality, detecting artifacts, and/or validating data against a predetermined threshold 134 to determine whether slide 110 is suitable for further analysis or requires corrective action such as restaining. The predetermined values for quality validation may include both an intensity threshold and an acceptable noise level calculated from a positive control. The processor 102 can validate the slide 110 if the staining satisfies both thresholds; if either metric is outside the acceptable range, the processor 102 can trigger an automated workflow to restain the slide 110, ensuring that high-quality results are delivered to the pathologist. As used in this disclosure, “intensity threshold” is a predetermined value representing the minimum or maximum acceptable level of signal intensity in an image. In non-limiting examples, intensity thresholds can be used to evaluate or segment areas of interest. Without limitation, intensity thresholds may refer to luminance, saturation, or optical density of a stained area and may be used to determine whether the staining meets specific criteria for quality, positivity, or other metrics.
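As a non-limiting illustration, deriving an intensity threshold and an acceptable noise level from positive control data, and then applying both in an inline quality control check, may be sketched as follows. The rule of mean plus or minus k sample standard deviations, the value k = 2, the function names, and all numeric values are assumptions made for illustration only; this disclosure does not prescribe a particular statistical rule.

```python
from statistics import mean, stdev


def derive_thresholds(control_intensities, control_noise, k=2.0):
    """Derive QC limits from historical positive-control measurements (hypothetical rule)."""
    return {
        # Staining weaker than mean - k*sd of the controls is treated as too faint.
        "min_intensity": mean(control_intensities) - k * stdev(control_intensities),
        # Background above mean + k*sd of the controls is treated as too noisy.
        "max_noise": mean(control_noise) + k * stdev(control_noise),
    }


def inline_qc(intensity, noise, thresholds):
    """Validate only if both metrics are within their acceptable ranges."""
    passed = (intensity >= thresholds["min_intensity"]
              and noise <= thresholds["max_noise"])
    return "validated" if passed else "flag for restaining"


limits = derive_thresholds([0.70, 0.75, 0.80], [0.04, 0.05, 0.06])
print(limits)
print(inline_qc(0.72, 0.05, limits))
```

Keeping the threshold derivation separate from the check allows a distinct set of limits to be computed per stain type from that stain's own control slides.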
[0035] Continuing to refer to Figure 1, the stain statistical model 142 can be stain-specific. Without limitation, the stain statistical model 142 can be made stain-specific by tailoring it to analyze features specific to the applied stain. For example, when applied to HER2 expression, the stain statistical model 142 can be trained using historical data from multiple HER2-positive control slides. These slides may exhibit a consistent, validated staining pattern, allowing the stain statistical model 142 to identify specific intensity and distribution patterns associated with HER2. Similarly, for CD8 expression, the stain statistical model 142 can analyze the prevalence of positively stained immune cells by referencing data from CD8-positive control slides. This specificity allows the stain statistical model 142 to set predetermined thresholds tailored to each stain type, ensuring an accurate assessment of staining quality and consistency based on the biological marker of interest.
[0036] Continuing to refer to Figure 1, the determination of the DAB (diaminobenzidine) value of the target stained core 130 may involve a series of analytical steps performed by the processor 102. First, the processor 102 can extract color data from the target stained core 130 on the digital slide 118, focusing on metrics such as color intensity, distribution, and uniformity within the stained area. This extracted data may then be compared to a reference dataset using the stain statistical model 142. The reference dataset may include, but is not limited to, validated staining patterns and acceptable variability derived from positive control slides adjusted for a specific stain type, such as HER2 or CD8. Based on this comparison, the stain statistical model 142 can calculate a predetermined intensity threshold or acceptable percentage value for positive staining, which may include metrics such as the minimum percentage of positively stained cells or acceptable ranges of staining uniformity and background noise. The processor 102 can then calculate the DAB value of the target stained core 130, representing staining intensity, optical density, brightness, or other quantifiable metrics indicating staining performance. Next, the calculated DAB value can be compared to a predetermined threshold 134. Without limitation, if the value meets or exceeds the predetermined threshold 134, slide 110 may be validated and sent to a pathologist for review. Conversely, if the value falls below the predetermined threshold 134, slide 110 may be flagged for restaining or additional processing to ensure that the results meet the required quality and consistency standards. This process for determining the DAB value of the target stained core 130 is shown in Figures 5A-B.
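As a non-limiting illustration, one way to express a DAB value as an optical density is the Beer-Lambert relation OD = -log10(I / I0), where I is the transmitted intensity of a pixel and I0 is the intensity of unstained background. The sketch assumes that the pixels passed in already belong to the DAB channel and use an 8-bit scale with I0 = 255; those assumptions, the function names, and the threshold of 0.4 are hypothetical and are not taken from this disclosure.

```python
import math


def optical_density(intensity, background=255.0):
    """Beer-Lambert optical density of one pixel; darker pixels give larger values."""
    # Clip to avoid log10(0) and negative densities from sensor noise.
    clipped = min(max(intensity, 1.0), background)
    return -math.log10(clipped / background)


def dab_value(dab_channel_pixels):
    """Mean optical density over the pixels of a core (one possible DAB metric)."""
    return sum(optical_density(p) for p in dab_channel_pixels) / len(dab_channel_pixels)


def qc_decision(value, threshold):
    """Validate when the value meets or exceeds the threshold, otherwise flag."""
    return "validated" if value >= threshold else "flag for restaining"


value = dab_value([25.5, 255.0])
print(qc_decision(value, threshold=0.4))
```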
[0037] Continuing to refer to Figure 1, aligning the target stained core 130 may include one or more of the following: rotating at least one digital slide 118, translating at least one digital slide 118, and reflecting at least one digital slide 118. For example, and without limitation, if a tissue sample 116 on a digital slide 118 is scanned at an angle, the processor 102 can detect the target stained core 130 and apply a rotational transformation to align the target stained core 130 vertically or horizontally to facilitate consistent analysis and comparison across slides. In another non-limiting example, aligning the target stained core 130 may include translating at least one digital slide 118 to center the target stained core 130 within the field of view. For example, if the target stained core 130 is located off-center in the scanned image, the processor 102 can adjust the position of the digital slide 118 by shifting it along the x and y axes so that the target stained core 130 is optimally positioned within a given frame for analysis. In another non-limiting example, aligning the target stained core 130 may also include reflecting at least one digital slide 118 to correct for mirror orientation. For example, if the digital slide 118 is scanned with the tissue section inverted or mirror-reversed, the processor 102 can identify the target stained core 130 and apply a reflection transformation to restore the correct orientation. This ensures that the target stained core 130 aligns with the reference data or standard format 132, enabling accurate evaluation and downstream processing.
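As a non-limiting illustration, the three alignment operations described above may be sketched on a small pixel grid. Rotation is restricted here to 90-degree steps for brevity, whereas an arbitrary scan angle would require interpolation; the function names and the zero fill value are assumptions made for illustration only.

```python
def rotate_90_clockwise(image):
    """Rotate a grid by 90 degrees clockwise (reverse the rows, then transpose)."""
    return [list(row) for row in zip(*image[::-1])]


def reflect_horizontal(image):
    """Mirror a grid left to right, undoing a horizontally mirrored scan."""
    return [row[::-1] for row in image]


def translate(image, dy, dx, fill=0):
    """Shift a grid by (dy, dx) pixels; pixels moved outside the frame are dropped."""
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                out[ny][nx] = image[y][x]
    return out


tile = [[1, 2], [3, 4]]
print(rotate_90_clockwise(tile))
print(reflect_horizontal(tile))
print(translate([[1, 0], [0, 0]], 1, 1))
```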
[0038] Continuing with reference to Figure 1, the apparatus 100 may be further configured to align two or more stained cores 112, including a mirrored configuration 144, to a standard format 132 by using metadata 120 to identify a target stained core 130, using at least one handling mirror 146, and superimposing a validation control 148 onto the target stained core 130. As used in this disclosure, “mirrored configuration” is an arrangement in which the orientation of an object, image, or structure is inverted along a specified axis to create a reflection of the original. A mirrored configuration 144 may occur when the image on the slide 110 is inverted horizontally, vertically, or both, thereby causing the position or orientation of features to appear inverted relative to their actual arrangement. Correcting a mirrored configuration 144 may include applying image transformations such as reflection or inversion to restore the original orientation for accurate analysis or comparison. As used in this disclosure, “handling mirror” is a reflective device or reflective surface configured to assist in manipulating, positioning, or aligning an object by providing a visual reflection. The handling mirror 146 can be used to enable precise adjustment or observation, particularly in environments where direct access to or visibility of an object is limited. As used in this disclosure, “validation control” is a reference or benchmark used to verify the accuracy, consistency, or reliability of a process, system, or output. The validation control 148 may include predetermined parameters, known standards, or reference samples that are evaluated along with the main subject of the analysis to ensure that the system or process operates as intended. In the context of a staining or imaging workflow, the validation control 148 may include a control core 136, a predetermined threshold 134, and/or a statistical model used to verify the validity of the results before proceeding with further analysis or decision-making. In a non-limiting example, the apparatus 100 may align two or more stained cores 112, including a mirrored configuration 144, into a standard format 132 during visualization of a tissue microarray (TMA) for quality control (QC). The apparatus 100 may first identify the target stained core 130 using metadata 120, such as a unique identifier for the core in the TMA grid or its spatial location. For example, metadata 120 may specify the row and column position of a particular core corresponding to a HER2-positive sample. To address the mirrored configuration 144, the apparatus 100 can then use a handling mirror 146 to virtually reflect the target stained core 130, ensuring that its orientation matches the expected standard format 132. For example, if the mirrored configuration 144 moves the upper right corner of the stained core to the lower left, the handling mirror 146 can flip the core horizontally and vertically to restore its correct alignment. The apparatus 100 can then overlay a validation control 148 onto the target stained core 130. This may include directly overlaying QC values, such as a staining intensity metric or positive area percentage, onto a digital image of the stained core. Additionally and/or alternatively, thresholds derived from a stain statistical model 142 may be displayed along with the QC values, for example as color-coded bars, indicating whether the stained core meets a predetermined threshold 134 for acceptable staining quality.
[0039] Continuing to refer to Figure 1, aligning two or more stained cores 112, which may include a mirror image configuration 144, into a standard format 132 may involve using the texture 150, deconvoluted channels 152, and at least one stain 154 of the two or more stained cores 112. As used in this disclosure, “texture” is a property of an image or material that describes the spatial arrangement, pattern, or variation in intensity or color of its surface. Texture 150 may include features such as smoothness, roughness, grain size, or repeatability of a particular structure within a subject area. Texture 150 may, but is not limited to, be quantified using computational techniques that evaluate properties such as contrast, correlation, or frequency components to help identify or classify objects based on their visual patterns. As used in this disclosure, “deconvoluted channels” is a processed representation of an image that isolates specific components or signals corresponding to individual stains or markers by separating overlapping color or spectral data. Deconvolution techniques can be applied to distinguish stains, such as removing noise or separating DAB (diaminobenzidine) and hematoxylin signals in a stained tissue sample 116. The deconvoluted channels 152 allow for a more accurate analysis of the contribution of specific stains to the image and facilitate targeted evaluation of biomarkers or cellular features. As used in this disclosure, “stain” is a chemical or biological agent applied to a tissue sample 116 to enhance the visibility of a particular structure, cell, or molecule under a microscope. At least one stain 154 may bind to a specific component of the tissue, such as a protein, nucleic acid, or membrane, to produce a different color or fluorescence. Examples of at least one stain 154 may include hematoxylin for the nucleus, eosin for the cytoplasm, and DAB for immunohistochemical detection of an antigen. 
Stains 154 can be used to highlight features of a subject for diagnostic, research, or quality control purposes. In a non-limiting example, aligning two or more stained cores 112 within a mirror image configuration 144 to a standard format 132 may involve analyzing the core texture 150 and utilizing deconvoluted channels 152 for accurate orientation. For example, in a tissue microarray where cores are symmetrically arranged, mirror image configurations 144 may occur due to scanning errors. The processor 102 may evaluate the texture 150 pattern of surrounding cores, such as glandular or stromal arrangement, to identify mirror image cores. Additionally and / or alternatively, the processor 102 may apply color deconvolution to isolate DAB channels, thereby highlighting target stains and enabling robust detection of positive mirror image configurations 144 and appropriate reorientation. In another non-limiting example, the processor 102 may use both the texture 150 and deconvoluted DAB and hematoxylin channels to align stained cores. In the case of cores stained with a biomarker such as HER2, DAB channels may show positivity in the cell membrane, while hematoxylin channels provide a structural reference by highlighting the nucleus. The processor 102 can analyze these features to reorient the mirrored cores and align them with the standard format 132 for subsequent analysis, ensuring that the core orientation matches the correct biological interpretation. In another non-limiting example, the processor 102 can perform high-magnification analysis on cores identified as positive to eliminate false positives. For example, and without limitation, the texture 150 of the core may initially suggest positivity due to staining artifacts or background noise. The processor 102 can analyze cell membrane, cytoplasm, and nuclear staining within the deconvoluted DAB and hematoxylin channels to confirm true biomarker expression. 
If positivity is found to be due to nonspecific staining, processor 102 can flag the core for quality control, ensuring accurate reporting and reducing diagnostic errors.
[0040] Continuing to refer to Figure 1, the time data 156 associated with at least one slide 110 may be used to determine which preprocessing step 158 generated a failure 162 associated with at least one slide 110 by organizing the time data 160 associated with at least one slide 110 in chronological order, organizing the preprocessing steps 158 corresponding to the time data 160 in chronological order, and using the time data 156 to identify the preprocessing step 158 corresponding to the failure 162. As used in this disclosure, “time data” is information that specifies a time-related attribute or event associated with an object, process, or system. In a non-limiting example, the time data 156 may include a timestamp, duration, or sequence indicating when a particular action, such as staining, scanning, or data transfer, occurred. As used in this disclosure, “preprocessing step” is an action or procedure performed on an object or data before its primary analysis or use. For example, but not limited to, preprocessing step 158 may include activities such as tissue sectioning, staining, slide labeling, scanning, or image enhancement. While not limited, preprocessing step 158 may aim to prepare slide 110 or its digital representation for a downstream process. Where used in this disclosure, “failure” is a deviation from expected behavior, quality, or performance in a system, process, or subject. While not limited, failure 162 may result from an error, defect, or failure in operation and may manifest as artifacts, incorrect metadata 120, or improper staining quality in the digital slide 118. In a non-limiting example, time data 156 associated with at least one slide 110 may include a timestamp marking the completion of the staining process. 
Subsequently, using the time data 156, it is possible to determine which preprocessing step 158 generated the failure 162 by organizing all time data 160 associated with slide 110, such as timestamps for tissue sectioning, staining, and scanning. The processor 102 can chronologically arrange these points of time data 160 together with the corresponding preprocessing steps 158. For example, if the failure 162 appears as irregular staining intensity on the digital slide 118, the processor 102 can match the timestamp of the staining process with the observed failure 162. By correlating the failure 162 with the staining preprocessing step 158, the apparatus can flag the staining operation as the cause of the problem, enabling targeted corrective actions.
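The chronological ordering described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the record layout (step name, timestamp), the function name, and in particular the lookup table that maps an observed failure type to a suspected preprocessing step are hypothetical, since the disclosure does not specify how a failure is associated with a step.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical record: (preprocessing step name, completion timestamp).
StepRecord = Tuple[str, datetime]


def locate_failing_step(
    records: List[StepRecord],
    failure_type: str,
    failure_to_step: Dict[str, str],
) -> Optional[Tuple[int, str, datetime]]:
    """Order the steps chronologically and return the (position, step,
    timestamp) of the step that the assumed lookup table associates with
    the failure, or None when no recorded step matches."""
    ordered = sorted(records, key=lambda record: record[1])
    suspect = failure_to_step.get(failure_type)
    for position, (step, timestamp) in enumerate(ordered):
        if step == suspect:
            return position, step, timestamp
    return None


if __name__ == "__main__":
    slide_records = [
        ("scanning", datetime(2025, 1, 1, 12, 0)),
        ("sectioning", datetime(2025, 1, 1, 9, 0)),
        ("staining", datetime(2025, 1, 1, 10, 30)),
    ]
    # Assumed mapping, for illustration only.
    lookup = {"irregular_staining_intensity": "staining"}
    print(locate_failing_step(slide_records, "irregular_staining_intensity", lookup))
```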
[0041] Continuing to refer to Figure 1, the verification output 138 is further configured to cross-validate a positive control core 164, the cross-validation comprising configuring one or more optical device parameters 166 of the optical device 106 as a function of the stained core 130 of the subject and the positive control core 164, scanning the stained core 130 of the subject at a certain magnified resolution 168 using at least one optical device 106, comparing the stained core 130 of the subject to an approved value, and generating a cross-validation output 170. Where used in this disclosure, “positive control core” is a reference component or standard within a system, process, or experimental setting designed to consistently exhibit known or expected positive results. The positive control core 164 may serve as a benchmark for verifying the functionality, reliability, or accuracy of a primary element being tested or analyzed within the system. Subsequently, the positive control core 164 may serve as a benchmark against which the stained core 130 of the subject is compared during evaluation. Where used in this disclosure, “optical device parameter” refers to a configurable setting or attribute of an optical device that determines its performance or output during scanning or imaging of an object. In non-limiting examples, optical device parameters 166 may include resolution, magnification level, focal plane, light intensity, or scanning speed. Subsequently, by adjusting the optical device parameters 166, accurate and high-quality imaging tailored to specific attributes of the stained core can be ensured. Where used in this disclosure, “magnified resolution” refers to the level of detail captured in the image as a result of increasing the optical or digital magnification during scanning or imaging. 
In non-limiting examples, magnified resolution 168 may enable detailed visualization of cellular or intracellular structures such as membranes, cytoplasm, or nuclei, and enable accurate assessment of staining quality or biomarker expression. Where used in this disclosure, “cross-validation output” refers to the result produced by re-evaluating or validating the verification output using additional data, alternative methods, or improved criteria. In non-limiting examples, the cross-validation output 170 can be used to ensure the consistency and reliability of the verification process by incorporating comparisons with high-resolution scanning, optimized optical device parameters 166, or approved values 164 to guarantee accurate conclusions. In a non-limiting example, further configuring the verification output 138 to cross-validate may include adjusting optical device parameters 166, such as increasing the magnification level and optimizing the focal plane, to scan the target stained core 130 at a magnified resolution 168. For example, if the verification output 138 for a HER2-stained tissue sample 116 shows borderline positivity based on a low-resolution scan, the processor 102 can reconfigure the optical device 106 to scan the target stained core 130 at a magnification of 40x. The resulting high-resolution image can reveal finer details of cell membrane staining, allowing the processor 102 to compare the findings with approved values, such as a predetermined threshold 134 for HER2 expression. A cross-validation output 170, such as "HER2 positivity confirmed" or "re-validation required," is then generated and can be sent to a downstream device 140 for further action. In another non-limiting example, cross-validation may include using optical device parameters 166 to improve brightness and contrast settings while scanning the target stained core 130 at a magnified resolution 168. 
For TMA cores stained with the CD8 immunocytoplasmic marker, the optical device 106 may be configured to capture detailed cytoplasmic and nuclear features, enabling accurate quantification of positively stained cells. The processor 102 can then compare these detailed findings to approved values, such as a threshold for CD8 positivity. Cross-validation output 170, such as "Positivity: 85%, validated," can ensure that the validation process is consistently accurate. In another non-limiting example, a stained core 130 of an object flagged as potentially overstained during initial validation may be re-evaluated by cross-validation. The processor 102 can configure the optical device 106 to scan the core at multiple depths of focus, ensuring a comprehensive analysis of the stained layer. Scans at magnified resolution 168 can then be compared to approved values 164 for acceptable staining intensity and uniformity. Based on the findings, cross-validation output 170, such as "Overstaining confirmed" or "Staining within acceptable limits," can be generated to ensure the integrity of the QC process.
[0042] Continuing to refer to Figure 1, in non-limiting examples, the expected results from performing staining quality control on a control core 136 and a target stained core may include several advantages. Batch controls can assist technicians operating the staining machine in evaluating the accuracy of the pretreatment step 158, while slide-by-slide controls can provide additional assurance to both technicians and pathologists. For example, a pathologist can evaluate slide 110 with greater confidence when the control core 136 is stained together with the tissue sample 116 on the same slide. The staining QC process can also provide a quantitative evaluation of the staining process, allowing technicians to independently determine whether slide 110 requires restaining. Additionally and / or alternatively, visualization of tissue microarrays for QC can be streamlined through TMA orientation correction, the use of a handling mirror 146, and the overlay of a verification output 138 onto a target stained core 130 paired with at least one predetermined value. Using a stain statistical model 142, it is possible to calculate at least one predetermined value and enable automated QC decisions, such as deciding whether to send the digital slide 118 to the pathologist or flag it for restaining. By examining the stained digital slide 118 across various pre-treatment steps 158, stains, and histological types, laboratory managers can leverage time data 160 to identify potential process improvements. Once the technician approves the digital slide 118 as positively stained, the pathologist can verify the accuracy of the stain by examining the verification control 148 of the control core 136 at a higher magnification, such as 40x. High-magnification scanning of the stained core 130 of the target can further assist in analyzing features such as cell membranes, cytoplasm, and nuclei, helping the pathologist identify and eliminate defects 162, thereby improving diagnostic accuracy.
[0043] Exemplary embodiments are disclosed above and shown in the accompanying drawings. Those skilled in the art will understand that various modifications, omissions, and additions can be made to those specifically disclosed herein without departing from the spirit and scope of the invention.
[0044] Referring here to Figure 2A, an illustrative Figure 200a of a control tissue microarray per slide is shown. In one embodiment, Figure 200a includes slide 202. In one embodiment, Figure 200a includes label 204. In one embodiment, Figure 200a includes tissue sample 208. In one embodiment, Figure 200a includes positive control core 212. In one embodiment, Figure 200a includes negative control core 216. In one embodiment, each prepared tissue sample 208 may be placed on slide 202 together with the corresponding positive control core 212 and negative control core 216. In one embodiment, the positive control core 212 and negative control core 216 may be strategically included to assess the quality of the slide preparation and staining process. In one embodiment, the positive control core 212 can confirm the presence of target staining under optimized conditions, while the negative control core 216 can ensure specificity by verifying the absence of nonspecific staining. In one embodiment, Figure 200a can demonstrate that the number of cores included per slide may vary depending on experimental or diagnostic requirements, allowing for flexibility in the design of the control configuration. In one embodiment, the setup allows for accurate evaluation of staining quality and slide preparation, ensuring reliable interpretation of the results.
[0045] Referring here to Figure 2B, an illustrative figure 200b of a batch control tissue microarray is shown. In one embodiment, Figure 200b includes a slide 202. In one embodiment, Figure 200b includes a label 204. In one embodiment, Figure 200b includes a plurality of control cores 220 arranged to represent a batch control configuration. In one embodiment, the plurality of control cores 220 may include a set of tissue types and staining variations representing the batch being processed. In one embodiment, the batch control tissue microarray may include five control cores 220, but the number of cores may vary depending on the experimental or diagnostic needs. In one embodiment, the plurality of control cores 220 can be used to evaluate the overall consistency and accuracy of the staining process across a batch of slides. In one embodiment, the batch control setup can account for variations in the staining process by comparing results from different cores within the same batch. In one embodiment, the batch control tissue microarray can account for inherent variations in tissue types and ensure that the staining results are accurate and reproducible for all samples in the batch. In one embodiment, including a batch control slide can provide technicians with a robust means of evaluating the performance of the staining machine. In one embodiment, since batch control slides ensure that the overall process quality is verified, pathologists can indirectly benefit by having confidence in the consistency of the staining process across the entire batch when examining individual slides.
[0046] Referring here to Figure 3A, an illustrative Figure 300a of a control tissue microarray having one labeled core is shown. In one embodiment, Figure 300a includes a slide 302. In one embodiment, Figure 300a includes a label 304 affixed to the slide 302 for identification. In one embodiment, Figure 300a includes a core 308 positioned on the slide 302.
[0047] Referring here to Figure 3B, an illustrative Figure 300b of a control tissue microarray having two cores of a 1×2 matrix array is shown. In one embodiment, Figure 300b includes a slide 302. In one embodiment, Figure 300b includes a label 304 affixed to the slide 302 for identification. In one embodiment, Figure 300b includes two cores of a 1×2 matrix array 308 arranged on the slide 302.
[0048] Referring here to Figure 3C, an illustrative Figure 300c shows a control tissue microarray having five cores of a 4×1 matrix and a 1 matrix array. In one embodiment, Figure 300c includes a slide 302. In one embodiment, Figure 300c includes a label 304 affixed to the slide 302 for identification. In one embodiment, Figure 300c includes five cores of a 4×1 matrix and a 1 matrix array 308 arranged on the slide 302.
[0049] Referring here to Figure 3D, an illustrative Figure 300d shows a control tissue microarray having nine cores of a 3x3 matrix array. In one embodiment, Figure 300d includes a slide 302. In one embodiment, Figure 300d includes a label 304 affixed to the slide 302 for identification. In one embodiment, Figure 300d includes nine cores of a 3x3 matrix array 308 arranged on the slide 302.
[0050] Referring here to Figure 3E, an illustrative Figure 300e of a control tissue microarray having 10 cores of a 2×5 matrix array is shown. In one embodiment, Figure 300e includes a slide 302. In one embodiment, Figure 300e includes a label 304 affixed to the slide 302 for identification. In one embodiment, Figure 300e includes 10 cores of a 2×5 matrix array 308 arranged on the slide 302.
[0051] Referring here to Figure 3F, an illustrative Figure 300f of a control tissue microarray having five cores of a 1×5 matrix array is shown. In one embodiment, Figure 300f includes slide 302. In one embodiment, Figure 300f includes label 304 affixed to slide 302 for identification. In one embodiment, Figure 300f includes five cores of a 1×5 matrix array 308 arranged on slide 302.
[0052] Referring here to Figure 4(A), an illustrative Figure 400a of a 2x4 tissue microarray having a target core at position 3 for a given stain is shown. In one embodiment, Figure 400a includes a target core 404 at position 3 for a given stain.
[0053] Referring here to Figure 4(B), an example figure 400b of a mirror image configuration along the vertical axis of a 2x4 tissue microarray is shown, where the target core appears in row 1, column 2. In one embodiment, Figure 400b includes the target core 404 appearing in row 1, column 2. In one embodiment, Figure 400b shows spatial displacement along the vertical axis 408.
[0054] Referring here to Figure 4(C), an example figure 400c is shown of a mirror image configuration along the horizontal axis of a 2x4 tissue microarray, where the target core appears in row 2, column 3. In one embodiment, Figure 400c includes the target core 404 appearing in row 2, column 3. In one embodiment, Figure 400c shows spatial displacement along the horizontal axis 412.
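The displacements described for Figures 4(A) through 4(C) can be reproduced with simple index arithmetic. The following minimal Python sketch assumes that core positions are numbered from 1 in row-major order (left to right, then top to bottom); that numbering convention and the function names are assumptions for illustration and are not stated in the disclosure.

```python
from typing import Tuple


def position_to_row_col(position: int, n_cols: int) -> Tuple[int, int]:
    """Convert a 1-based, row-major core position to (row, column)."""
    row, col = divmod(position - 1, n_cols)
    return row + 1, col + 1


def mirror_about_vertical_axis(row: int, col: int, n_cols: int) -> Tuple[int, int]:
    """Left-right mirror: the column index is reversed, the row is kept."""
    return row, n_cols + 1 - col


def mirror_about_horizontal_axis(row: int, col: int, n_rows: int) -> Tuple[int, int]:
    """Top-bottom mirror: the row index is reversed, the column is kept."""
    return n_rows + 1 - row, col


if __name__ == "__main__":
    n_rows, n_cols = 2, 4
    row, col = position_to_row_col(3, n_cols)          # target core at position 3
    print(mirror_about_vertical_axis(row, col, n_cols))    # apparent location, Figure 4(B)
    print(mirror_about_horizontal_axis(row, col, n_rows))  # apparent location, Figure 4(C)
```

Because each mirror operation is its own inverse, applying the same function to the apparent location recovers the original location, which is the basis of the correction step.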
[0055] Referring here to Figure 4(D), an illustrative figure 400d shows how mirror configurations are detected using the texture and color of specific cores. In one embodiment, Figure 400d includes core 404 of interest. In one embodiment, Figure 400d includes multiple cores, with specific cores such as C1, C4, C5, and C8 being used to identify mirror configurations. In one embodiment, the detection process can analyze the texture patterns of these cores, such as structural or spatial features, along with their color characteristics, to determine whether the cores are in a mirror configuration. In one embodiment, this identification can facilitate correction of mirrored orientations and ensure proper alignment of tissue samples for accurate downstream analysis.
[0056] Referring here to Figure 5A, illustrative Figure 500a shows a workflow for calculating DAB values for each stain. In one embodiment, Figure 500a includes identifying a control tissue microarray 508 from the entire slide image 504. In one embodiment, the control TMA 508 is extracted 512. In one embodiment, the control TMA 508 is oriented to a standard format 516. In one embodiment, Figure 500a includes performing mirror correction 520 to correctly align the TMA cores. In one embodiment, the DAB value of the TMA core identified by core number 524 is calculated for each stain. In one embodiment, the DAB values are collected across multiple positive slides for the target core 528 corresponding to a particular stain. For example, but not limited to, the DAB value for core C2 can be collected across multiple slides for a particular stain 532. In one embodiment, the workflow includes calculating the mean and standard deviation of the collected DAB values as part of a statistical modeling process 536.
[0057] Referring here to Figure 5B, illustrative Figure 500b shows a statistical model for multiple stains. In one embodiment, Figure 500b includes a vertical axis of immunohistochemistry slide number 540. In one embodiment, Figure 500b includes a horizontal axis of mean DAB for stain-specific cores 542. In one embodiment, Figure 500b includes a graphical representation of the DAB values for each stain, shown as Gaussian curves 544, 546, and 548. In one embodiment, the mean and standard deviation 554, 552, and 550 for each stain are calculated and maintained separately within the model. In one embodiment, the statistical model uses the mean and standard deviation for a given stain to determine whether a new slide is positive for that stain. In one embodiment, this determination is based on a threshold calculated using the formula: threshold = mean ± (2 × standard deviation).
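As a non-limiting illustration of the statistical model described above, the following Python sketch computes, for each stain, the mean, the standard deviation, and both bounds given by threshold = mean ± (2 × standard deviation). The use of the sample standard deviation, the dictionary layout, and the function name are assumptions; the disclosure does not state which estimator or which of the two bounds is applied in a given decision.

```python
from statistics import mean, stdev
from typing import Dict, List


def fit_stain_model(dab_values_by_stain: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Keep a separate mean, standard deviation, and threshold pair per stain.

    Each stain needs at least two collected DAB values, because the sample
    standard deviation is undefined for a single value.
    """
    model = {}
    for stain, values in dab_values_by_stain.items():
        mu = mean(values)
        sigma = stdev(values)  # assumption: sample standard deviation
        model[stain] = {
            "mean": mu,
            "std": sigma,
            "lower": mu - 2 * sigma,
            "upper": mu + 2 * sigma,
        }
    return model


if __name__ == "__main__":
    # Hypothetical DAB values collected for one target core across positive slides.
    collected = {"HER2": [10.0, 20.0, 30.0]}
    print(fit_stain_model(collected))
```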
[0058] Referring here to Figure 6, illustrative Figure 600 illustrates a workflow for quality control of new slides. In one embodiment, Figure 600 includes a process for comparing the DAB value of a stained tissue microarray core for a particular stain 604 with a threshold 608 associated with each stain. In one embodiment, if the DAB value for a stain exceeds the threshold 608, the TMA core is considered positive 612a. In one embodiment, if the DAB value for a stain is less than the threshold 608, the TMA core is considered negative 612b. In one embodiment, the positive core 612a is then evaluated at magnified resolution to assess for false positives. In one embodiment, evaluation at higher magnification may include matching the staining location on specific cellular structures such as the cytoplasm 620a, cell membrane 620b, and nucleus 620c, based on the type of stain 616. In one embodiment, the determination of false positives may ensure accurate validation of the TMA core and contribute to a reliable interpretation of the staining results.
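The decision flow described above may be sketched as follows. This minimal Python example is an illustration under assumptions: the function name and the string labels are hypothetical, the observed staining location is assumed to be supplied by a separate high-magnification analysis, and a DAB value exactly equal to the threshold is treated as negative because the disclosure leaves that case unspecified.

```python
def qc_decision(
    dab_value: float,
    threshold: float,
    expected_location: str,
    observed_location: str,
) -> str:
    """Classify a TMA core for one stain.

    A core is negative unless its DAB value exceeds the threshold. A core
    above the threshold is then checked at magnified resolution: staining
    found on a cellular structure other than the one expected for the stain
    type is reported as a false positive.
    """
    if dab_value <= threshold:  # assumption: equality counts as negative
        return "negative"
    if observed_location != expected_location:
        return "false positive"
    return "positive"


if __name__ == "__main__":
    # Hypothetical values: a membrane stain observed on the membrane.
    print(qc_decision(0.8, 0.5, "cell membrane", "cell membrane"))
    # The same DAB value, but staining observed in the cytoplasm instead.
    print(qc_decision(0.8, 0.5, "cell membrane", "cytoplasm"))
```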
[0059] Referring here to Figure 7, a flowchart of an exemplary method 700 for automatically verifying quality data associated with at least one slide is shown. In step 705, method 700 includes receiving at least one digital slide corresponding to at least one slide from at least one optical device. This can be carried out as described with reference to Figures 1 to 6.
[0060] Referring further to Figure 7, in step 710, method 700 includes determining slide identification using metadata. This can be carried out as described with reference to Figures 1 to 6.
[0061] Referring further to Figure 7, in step 715, method 700 includes locating at least one digital slide using at least one object detection technique, and locating at least one digital slide includes identifying the stained core of the object and aligning the stained core of the object to a standard format. This can be carried out as described with reference to Figures 1 to 6.
[0062] Referring further to Figure 7, in step 720, method 700 includes evaluating the stained core of a target using a predetermined threshold, the evaluation including comparing the stained core of the target with a control core, generating a verification output as a function of the comparison between the stained core of the target and the control core, and transmitting the verification output to a downstream device. This can be carried out as described with reference to Figures 1 to 6.
[0063] Referring here to Figure 8, an exemplary embodiment of a machine learning module 800 capable of performing one or more machine learning processes as described in this disclosure is shown. The machine learning module may use the machine learning processes to perform decision, classification, and / or analysis steps, methods, processes, etc., as described in this disclosure. A “machine learning process,” as used in this disclosure, is a process that automatically uses training data 804 to generate an algorithm instantiated with hardware or software logic, data structures, and / or functions, which are executed by a computing device / module to produce an output 808 given data provided as input 812, in contrast to a non-machine learning software program, in which the commands to be executed are predetermined by the user and written in a programming language.
[0064] Referring further to Figure 8, “training data,” as used herein, is data containing correlations that a machine learning process can use to model relationships between two or more categories of data elements. For example, and without limitation, training data 804 may contain multiple data entries, also known as “training examples,” each entry representing a set of data elements recorded, received, and / or generated together, and the data elements may be correlated by shared presence in a given data entry, by proximity in a given data entry, and the like. Multiple data entries within training data 804 may reveal one or more trends in the correlation between categories of data elements. For example, and without limitation, higher values of a first data element belonging to a first category of data elements may tend to correlate with higher values of a second data element belonging to a second category of data elements, indicating a potential proportional relationship or other mathematical relationship linking values belonging to the two categories. Multiple categories of data elements can be related in the training data 804 according to various correlations, which may indicate causal and / or predictive links between categories of data elements, and which can be modeled as relationships such as mathematical relationships by a machine learning process, as will be described in more detail below. The training data 804 can be formatted and / or organized by categories of data elements, for example, by associating data elements with one or more descriptors corresponding to categories of data elements. 
As a non-limiting example, the training data 804 may include data entered in a standardized format by a person or process such that entries of a given data element in a given field in a form can be mapped to one or more descriptors of a category. Elements within the training data 804 may be linked to category descriptors by tags, tokens, or other data elements. For example, the training data 804 may be provided in a fixed-length format, a format that links the data location to categories such as Comma-Separated Value (CSV) format, and / or a self-describing format such as extensible markup language (XML) or JavaScript® Object Notation (JSON), enabling a process or device to discover the categories of the data.
[0065] Alternatively or additionally, continuing to refer to Figure 8, the training data 804 may include one or more unclassified elements; that is, the training data 804 may not be formatted or may not include descriptors for some elements of the data. Machine learning algorithms and / or other processes can sort the training data 804 according to one or more classifications, for example, using natural language processing algorithms, tokenization, or detection of correlation values in the raw data, and the categories can be generated using correlation and / or other processing algorithms. As a non-limiting example, in a text corpus, phrases constituting a number "n" of compound words, such as nouns modified by other nouns, may be identified according to the statistically significant prevalence of n-grams containing such words in a particular order, and such n-grams may be classified as linguistic elements such as "words," which will be tracked as well as single words, and new categories can be generated as a result of statistical analysis. Similarly, in data entries containing some text data, a person's name may be identified by referencing a list, dictionary, or other list of terms, which may enable ad-hoc classification by machine learning algorithms and / or automated association of data within a data entry with descriptors or to a given format. The ability to automatically classify data entries may make the same training data 804 applicable to two or more separate machine learning algorithms, as will be described in more detail below. The training data 804 used by the machine learning module 800 may correlate any input data as described in this disclosure to any output data as described in this disclosure. As a non-limiting illustrative example, the input may include at least one slide 110, and the output may include properties of at least one slide 110, such as texture, hue, etc.
[0066] Referring further to Figure 8, one or more supervised and/or unsupervised machine learning processes and/or models may be used to filter, sort, and/or select training data; such models may include, but are not limited to, a training data classifier 816. The training data classifier 816 may include a “classifier,” which, as used in this disclosure, is a machine learning model, such as a data structure representing and/or using a mathematical model, neural network, or program generated by a machine learning algorithm known as a “classification algorithm,” as defined below, that sorts inputs into categories or bins of data and outputs the categories or bins of data and/or labels associated therewith. The classifier may be configured to output at least one datum that labels or otherwise identifies a set of data that are clustered together and found to be close under a distance metric, as described below. The distance metric may include, but is not limited to, any norm, such as the Pythagorean (Euclidean) norm. The machine learning module 800 may generate a classifier using a classification algorithm, defined as a process by which a computing device and/or any module and/or component operating thereon derives a classifier from the training data 804. Classification may be performed using, but is not limited to, linear classifiers such as logistic regression and/or naive Bayes classifiers, nearest neighbor classifiers such as k-nearest neighbor classifiers, support vector machines, least squares support vector machines, Fisher's linear discriminant, quadratic classifiers, decision trees, boosted trees, random forest classifiers, learning vector quantization, and/or neural network-based classifiers. As a non-limiting example, the training data classifier 816 may classify elements of the training data into categories of texture.
[0067] Referring further to Figure 8, a computing device may be configured to generate a classifier using a naive Bayes classification algorithm. The naive Bayes classification algorithm generates a classifier by assigning class labels to problem instances, which are represented as vectors of element values. The class labels are drawn from a finite set. The naive Bayes classification algorithm may include a family of algorithms that assume that the value of a particular element is independent of the value of any other element, given the class variable. The naive Bayes classification algorithm is based on Bayes' theorem, expressed as $P(A \mid B) = P(B \mid A)\,P(A) / P(B)$, where $P(A \mid B)$ is the probability of hypothesis A given data B, also known as the posterior probability; $P(B \mid A)$ is the probability of data B given that hypothesis A is true; $P(A)$ is the probability that hypothesis A is true regardless of the data, also known as the prior probability of A; and $P(B)$ is the probability of the data regardless of the hypothesis. A naive Bayes classifier may be generated by first transforming the training data into a frequency table. Next, the computing device may compute a likelihood table by calculating the probabilities of different data entries and classification labels. The computing device may then use the naive Bayes equation to compute the posterior probability of each class. The class with the highest posterior probability is the prediction result. Naive Bayes classification algorithms may include Gaussian models, which assume a normal distribution. Naive Bayes classification algorithms may include multinomial models, which are used for discrete counts. Naive Bayes classification algorithms may include Bernoulli models, which may be used when the vectors are binary.
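The frequency-table, likelihood, and posterior steps described above can be sketched as follows. This is an illustrative sketch only: the function names, the binary feature encoding, and the use of Laplace smoothing (`alpha`) are assumptions that do not appear in this disclosure.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Build the frequency tables: class counts and per-class feature-value counts."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i][value] += 1
    return class_counts, feature_counts


def posterior(row, class_counts, feature_counts, alpha=1.0, n_values=2):
    """Return P(class | row) for every class.

    alpha is an assumed Laplace smoothing constant; n_values is the number of
    distinct values each feature can take (2 for binary features).
    """
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        p = count / total  # prior P(A)
        for i, value in enumerate(row):
            # smoothed likelihood P(B_i | A), features treated as independent
            p *= (feature_counts[label][i][value] + alpha) / (count + alpha * n_values)
        scores[label] = p
    evidence = sum(scores.values())  # P(B), the normalizing constant
    return {label: score / evidence for label, score in scores.items()}
```

The predicted class is the key with the largest posterior, for example `max(post, key=post.get)` on the returned dictionary.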
[0068] Continuing to refer to Figure 8, the computing device may be configured to generate a classifier using a K-nearest neighbor (KNN) algorithm. A “K-nearest neighbor algorithm,” as used in this disclosure, includes a classification method that uses feature similarity to analyze how closely out-of-sample features resemble the training data, and that classifies input data into one or more clusters and/or categories of features represented in the training data. This may be performed by representing both the training data and the input data in vector form and using one or more measures of vector similarity to identify classifications within the training data and to determine a classification of the input data. The K-nearest neighbor algorithm may include specifying a K value, that is, a number directing the classifier to select the k most similar entries in the training data to a given sample; determining the most common classification among those entries; and classifying the sample accordingly. These steps may be performed recursively and/or iteratively to generate a classifier that can be used to classify further samples of input data. For example, an initial set of samples may be run to cover an initial heuristic and/or “first guess” at an output and/or relationship, which may be seeded, without limitation, with expert input received according to any process described herein. As a non-limiting example, the initial heuristic may include ranking associations between inputs and elements of the training data. The heuristic may include selecting some number of the highest-ranking associations and/or training data elements.
[0069] Continuing to refer to Figure 8, the k-nearest neighbor algorithm may include generating a first vector output containing a data entry cluster and a second vector output containing the input data, and calculating the distance between the first and second vector outputs using any suitable norm or measure, such as cosine similarity or Euclidean distance. Each vector output may be represented as an n-tuple of values, where n is at least two. Each value in the n-tuple may represent a measured or other quantitative value associated with a given category or attribute of the data, examples of which are provided in further detail below. A vector may be represented, without limitation, in n-dimensional space using one axis per category of value in the n-tuple, such that the vector has a geometric direction characterizing the relative amounts of the attributes in the n-tuple compared to each other. Two vectors may be considered equivalent if their directions and/or the relative amounts of the values within each vector are the same. Thus, as a non-limiting example, a vector represented as [5, 10, 15] may be treated as equivalent to a vector represented as [1, 2, 3] for the purposes of this disclosure. Vectors may be more similar where their directions are more similar, and more different where their directions are more divergent; however, vector similarity may alternatively or additionally be determined using an average of similarities between like attributes, any other measure of similarity suitable for an n-tuple of values, or an aggregation of numerical similarity measures for the purposes of a loss function, as described in more detail below. Any vector described herein may be scaled so that each vector represents each attribute along an equivalent scale of values. Each vector may be “normalized,” that is, divided by a length attribute $l$ derived using the Pythagorean norm:

$$l = \sqrt{\sum_{i=1}^{n} a_i^2}$$

where $a_i$ is the i-th attribute of the vector.
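As an illustration of the vector comparison described in paragraphs [0068] and [0069], the sketch below normalizes vectors by their Pythagorean norm, compares them by cosine similarity, and takes a majority vote among the k most similar training entries. The function names and the `(vector, label)` training layout are assumptions made for the example, and zero-length vectors are not handled.

```python
import math
from collections import Counter


def normalize(vec):
    """Divide a vector by its Pythagorean (Euclidean) norm; vec must be non-zero."""
    length = math.sqrt(sum(a * a for a in vec))
    return [a / length for a in vec]


def cosine_similarity(u, v):
    """Dot product of the two normalized vectors; 1.0 means identical direction."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def knn_classify(sample, training, k=3):
    """training: list of (vector, label) pairs.

    Returns the most common label among the k entries most similar to sample.
    """
    ranked = sorted(
        training,
        key=lambda item: cosine_similarity(sample, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Under this measure, [5, 10, 15] and [1, 2, 3] normalize to the same unit vector, which is why they can be treated as equivalent.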
[0070] Referring further to Figure 8, training examples for use as training data may be selected from a population of potential examples according to a cohort relevant to the analytical problem or classification task to be solved. Alternatively or additionally, training data may be selected to span the set of situations or inputs that the machine learning model and/or process is likely to encounter when deployed. For example, for each category of input data to a machine learning process or model that may take a range of values within a set of phenomena such as images, user data, process data, or physical data, the computing device, processor, and/or machine learning model may select training examples representing each possible value and/or a representative sample of values in that range. Selecting a representative sample may include, for example, selecting training examples in proportion to a statistically determined and/or predicted distribution of such values according to relative frequency, such that values encountered more frequently in the population of data to be analyzed are represented by more training examples than values encountered less frequently. Alternatively or additionally, the set of training examples may be compared against a set of representative values in a database and/or presented to a user, so that the process can detect, automatically or via user input, one or more values that are not included in the set of training examples. The computing device, processor, and/or module may automatically generate a missing training example, which may be done by receiving and/or retrieving a missing input and/or output value and correlating it with a corresponding output and/or input value collated in a data record with the retrieved value, provided by a user and/or other device, or the like.
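Selection in proportion to relative frequency can be sketched as below. The helper name `proportional_sample`, the grouping `key` function, and the fixed random seed are all assumptions made for illustration; because each group's quota is rounded (with a floor of one example), the total may differ slightly from `n`.

```python
import random
from collections import defaultdict


def proportional_sample(examples, key, n, seed=0):
    """Pick about n examples so each group appears in proportion to its frequency."""
    groups = defaultdict(list)
    for example in examples:
        groups[key(example)].append(example)
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    total = len(examples)
    chosen = []
    for members in groups.values():
        # quota proportional to group size, but never zero for a non-empty group
        quota = max(1, round(n * len(members) / total))
        chosen.extend(rng.sample(members, min(quota, len(members))))
    return chosen
```

With 80 examples in one group and 20 in another, a request for 10 examples yields 8 and 2 respectively.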
[0071] Continuing to refer to Figure 8, the computing device, processor, and/or module may be configured to preprocess the training data. “Preprocessing” the training data, as used in this disclosure, means converting the training data from its raw form into a format that can be used to train a machine learning model. Preprocessing may include sanitization, feature selection, feature scaling, data augmentation, and the like.
[0072] Referring further to Figure 8, the computing device, processor, and/or module may be configured to sanitize training data. “Sanitizing” training data, as used in this disclosure, is the process of removing training examples that would prevent a machine learning model and/or process from converging to a useful outcome. For example, and without limitation, training examples may include input and/or output values that are outliers relative to values typically encountered, such that a machine learning algorithm using those examples would be adapted to unlikely quantities as input and/or output; a value more than a threshold number of standard deviations from the mean or expected value, for example, may be removed. Alternatively or additionally, one or more training examples may be identified as having low-quality data, where “low-quality” is defined as having a signal-to-noise ratio below a threshold. Sanitization may include steps such as removing duplicate or otherwise redundant data, interpolating missing data, correcting data errors, standardizing data, and identifying outliers. In a non-limiting example, sanitization may include using an algorithm that identifies duplicate entries or a spell-checking algorithm.
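Two of the sanitization steps above, duplicate removal and outlier rejection by standard deviation, might look like the following sketch. The record layout (tuples with a numeric field at index `field`), the function name, and the default cutoff of two standard deviations are assumptions for the example.

```python
import statistics


def sanitize(examples, field=1, max_sigma=2.0):
    """Remove exact duplicate records, then remove statistical outliers.

    examples: list of hashable tuples; examples[i][field] is numeric.
    A record is dropped when its value lies more than max_sigma population
    standard deviations from the mean of the de-duplicated values.
    """
    unique = list(dict.fromkeys(examples))  # drops duplicates, keeps order
    values = [example[field] for example in unique]
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return unique  # all values identical, nothing to reject
    return [ex for ex in unique if abs(ex[field] - mean) <= max_sigma * sigma]
```

Note that for very small sets a single extreme value inflates the standard deviation, so a lower cutoff or a robust statistic such as the median may be preferable.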
[0073] As a non-limiting example, and referring further to Figure 8, images used to train an image classifier or other machine learning model and/or process that takes images as input or produces images as output may be rejected if their image quality falls below a threshold. For example, and without limitation, the computing device, processor, and/or module may perform blur detection and reject one or more blurry images. As a non-limiting example, blur detection may be performed by taking a Fourier transform of the image, or an approximation such as a fast Fourier transform (FFT), and analyzing the distribution of low and high frequencies in the resulting frequency-domain representation of the image, where a number of high-frequency values below a threshold level may indicate blur. As a further non-limiting example, blur detection may be performed by convolving the image, a channel of the image, or the like with a Laplacian kernel, which may generate a numerical score reflecting the number of rapid changes in intensity in the image, such that a high score indicates sharpness and a low score indicates blur. Blur detection may be performed using gradient-based operators, which measure focus based on the gradient or first derivative of an image, on the hypothesis that abrupt changes indicate sharp edges and therefore a lower degree of blur. Blur detection may be performed using wavelet-based operators, which exploit the ability of discrete wavelet transform coefficients to describe the frequency and spatial content of an image. Blur detection may be performed using statistics-based operators, which use several image statistics as texture descriptors to compute a focus level. Blur detection may be performed using discrete cosine transform (DCT) coefficients to compute the focus level of an image from its frequency content.
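The Laplacian-kernel approach can be sketched in plain Python as follows. Using the variance of the Laplacian response as the numerical score is a common choice but is an assumption here, as are the function names, the 4-neighbour kernel, and the representation of an image as a 2D list of grayscale intensities; any rejection threshold would need to be chosen empirically.

```python
import statistics

# 4-neighbour discrete Laplacian kernel
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def laplacian_variance(image):
    """Convolve the interior pixels with the Laplacian kernel and return the
    population variance of the responses (higher means sharper)."""
    height, width = len(image), len(image[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = sum(
                LAPLACIAN[j][i] * image[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            )
            responses.append(response)
    return statistics.pvariance(responses)


def is_blurry(image, threshold):
    """Flag an image whose sharpness score falls below the given threshold."""
    return laplacian_variance(image) < threshold
```

A smooth intensity ramp scores zero because its second derivative vanishes, while a checkerboard pattern of alternating extremes scores very high.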
[0074] Continuing to refer to Figure 8, the computing device, processor, and/or module may be configured to precondition one or more training examples. For example, where a machine learning model and/or process has one or more inputs and/or outputs that require, transmit, or receive a certain number of bits, samples, or other units of data, the elements of one or more training examples to be used as, or compared to, those inputs and/or outputs may be modified to have that number of data units. For example, the computing device, processor, and/or module may convert a smaller number of units, such as in a low-pixel-count image, into a desired number of units, for example by upsampling and interpolation. As a non-limiting example, a low-pixel-count image may have 100 pixels, while the desired number of pixels may be 128. The processor may interpolate the low-pixel-count image to convert the 100 pixels into 128 pixels. Those skilled in the art, upon reviewing this disclosure, will be aware of various methods for interpolating a smaller number of data units, such as samples, pixels, or bits, into a desired number of such units. In some cases, a set of interpolation rules may be trained using sets of highly detailed inputs and/or outputs together with corresponding inputs and/or outputs downsampled to fewer units, and a neural network or other machine learning model may be trained with that training data to predict interpolated pixel values.
As a non-limiting example, a sample input and/or output, such as a sample picture with sample-expanded data units (e.g., pixels added between the original pixels), may be input to a neural network or machine learning model, which may output a pseudo-replica sample picture in which the dummy values assigned to the pixels between the original pixels are set based on the set of interpolation rules. As a non-limiting example, in the context of an image classifier, a machine learning model may have a set of interpolation rules trained on sets of highly detailed images and images downsampled to fewer pixels, and a neural network or other machine learning model may be trained using those examples to predict interpolated pixel values in a facial picture context. As a result, an input with sample-expanded data units (those added between the original data units, with dummy values) may be passed through the trained neural network and/or model, which may fill in values to replace the dummy values. Alternatively or additionally, the processor, computing device, and/or module may use a sample expander method, a low-pass filter, or both. As used in this disclosure, a “low-pass filter” is a filter that passes signals below a selected cutoff frequency and attenuates signals above the cutoff frequency. The exact frequency response of the filter depends on the filter design. The computing device, processor, and/or module may use averaging, such as luma or chroma averaging in an image, to fill in the data units between the original data units.
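As one simple, non-learned way to convert 100 data units into 128, the sketch below linearly interpolates a one-dimensional sequence such as a single row of pixel intensities. The function name and the endpoint-preserving mapping are assumptions for the example; a trained interpolation model as described above would replace the linear blend.

```python
def resample_linear(samples, target_len):
    """Linearly interpolate a 1D sequence to target_len samples.

    The first and last source samples map to the first and last outputs.
    """
    n = len(samples)
    if target_len == 1 or n == 1:
        return [samples[0]] * target_len
    out = []
    for k in range(target_len):
        pos = k * (n - 1) / (target_len - 1)  # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        # weighted blend of the two nearest source samples
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


row = [float(v) for v in range(100)]  # hypothetical 100-pixel row
upsampled = resample_linear(row, 128)  # 128 interpolated values
```

For a two-dimensional image the same routine could be applied first along rows and then along columns.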
[0075] In some embodiments, continuing with reference to Figure 8, the computing device, processor, and/or module may downsample elements of a training example to a desired smaller number of data elements. As a non-limiting example, a high-pixel-count image may have 256 pixels, while the desired number of pixels may be 128. The processor may downsample the high-pixel-count image to convert the 256 pixels into 128 pixels. Downsampling, also known as decimation, may involve a process known as “compression,” which removes every Nth entry in a sequence of samples, all but every Nth entry, or the like, and which may be performed, for example, by an N-sample compressor implemented in hardware or software. Anti-aliasing filters, anti-imaging filters, and/or low-pass filters may be used to remove the side effects of compression.
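A minimal decimation sketch follows, assuming block averaging as a crude low-pass (anti-aliasing) step before one value per block is kept. The function name is illustrative, and a production filter would normally have a better frequency response than a simple average.

```python
def decimate(samples, factor):
    """Reduce the sample count by `factor`.

    Each output is the mean of `factor` consecutive inputs, which low-pass
    filters the signal before discarding samples. Trailing samples that do
    not fill a whole block are dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


row = list(range(256))      # hypothetical 256-pixel row
halved = decimate(row, 2)   # 128 values
```

Averaging before discarding prevents a rapidly alternating pattern from aliasing into a false constant or low-frequency pattern.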
[0076] Referring further to Figure 8, feature selection includes narrowing and/or filtering the training data to exclude features and/or elements, sets of features and/or elements, or training data containing such elements, based on their relevance or usefulness to the intended task or purpose of the trained machine learning model and/or algorithm. Feature selection may be carried out using any process described in this disclosure, including, but not limited to, use of a training data classifier, exclusion of outliers, and the like.
[0077] Continuing to refer to Figure 8, feature scaling may include, but is not limited to, normalization of data entries, which may be achieved by dividing a numerical field by its norm, as is done in vector normalization. Feature scaling may include absolute maximum scaling, in which each quantitative datum is divided by the largest absolute value of all quantitative data in the set or subset of quantitative data. Feature scaling may include min-max scaling, in which the minimum value $X_{min}$ of the set or subset of values is subtracted from each value $X$, and the result is divided by the range of the values, given the maximum value $X_{max}$ in the set or subset:

$$X_{new} = \frac{X - X_{min}}{X_{max} - X_{min}}$$
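The absolute maximum and min-max scaling described in paragraph [0077] can be written directly, as in the sketch below. The function names and the handling of degenerate inputs (a constant or all-zero set) are assumptions for the example.

```python
def min_max_scale(values):
    """Map values to [0, 1]: subtract the minimum, divide by the range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumed convention for a constant set
    return [(v - lo) / (hi - lo) for v in values]


def max_abs_scale(values):
    """Divide every value by the largest absolute value in the set."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)  # all zeros, nothing to scale
    return [v / peak for v in values]
```

Min-max scaling places every value in the range 0 to 1, whereas absolute maximum scaling preserves sign and places values in the range -1 to 1.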
[0078] Referring further to Figure 8, the machine learning module 800 may be configured to perform a lazy learning process 820 and/or protocol, which may alternatively be called a “lazy loading” or “call-when-needed” process and/or protocol. In such a process, machine learning is performed upon receipt of an input to be converted into an output, by combining the input with the training set to derive, on demand, the algorithm used to generate the output. For example, an initial set of simulations may be performed to cover an initial heuristic and/or “first guess” at an output and/or relationship. As a non-limiting example, the initial heuristic may include ranking associations between inputs and elements of the training data 804. The heuristic may include selecting some number of the highest-ranking associations and/or elements of the training data 804. Lazy learning may implement any suitable lazy learning algorithm, including, but not limited to, the K-nearest neighbor algorithm, a lazy naive Bayes algorithm, and the like. Those skilled in the art will recognize, upon reviewing the entirety of this disclosure, a variety of lazy learning algorithms that may be applied to produce the outputs described herein, including, but not limited to, lazy learning applications of the machine learning algorithms described in further detail below.
[0079] Alternatively or additionally, continuing to refer to Figure 8, a machine learning model 824 may be generated using a machine learning process such as those described in this disclosure. A “machine learning model,” as used in this disclosure, is a data structure that represents and/or instantiates a mathematical and/or algorithmic representation of a relationship between inputs and outputs, as generated using any machine learning process, including, but not limited to, any of the processes described above, and stored in memory; once the model is created, an input submitted to the machine learning model 824 produces an output based on the derived relationship. For example, a linear regression model generated using a linear regression algorithm may compute a linear combination of input data, using coefficients derived during the machine learning process, to calculate output data. As a further non-limiting example, a machine learning model 824 may be generated by creating an artificial neural network, such as a convolutional neural network, that includes an input layer of nodes, one or more hidden layers, and an output layer of nodes. The connections between nodes may be created through a process of "training" the network, in which elements from the training data 804 are applied to the input nodes and a suitable training algorithm (such as Levenberg-Marquardt, conjugate gradient, simulated annealing, or another algorithm) is then used to adjust the connections and weights between nodes in adjacent layers of the neural network so as to produce the desired values at the output nodes. This process is sometimes called deep learning.
[0080] Referring further to Figure 8, the machine learning algorithm may include at least one supervised machine learning process 828. The at least one supervised machine learning process 828, as defined herein, includes an algorithm that receives a training set relating a number of inputs to a number of outputs and seeks to generate one or more data structures representing and/or instantiating one or more mathematical relationships between the inputs and the outputs, where each relationship is optimal according to some criterion specified to the algorithm using some scoring function. For example, a supervised learning algorithm may include at least one slide as described above as an input, properties of the at least one slide, such as a texture classification or hue classification, as an output, and a scoring function representing a desired form of relationship to be detected between the inputs and outputs. The scoring function may, for example, seek to maximize the probability that a given input and/or combination of input elements is associated with a given output, and to minimize the probability that a given input is not associated with a given output. The scoring function may be expressed as a risk function representing the “expected loss” of an algorithm relating inputs to outputs, where loss is computed as an error function representing the degree to which a prediction generated by the relationship is incorrect when compared to a given input-output pair provided in the training data 804. Those skilled in the art will recognize, upon reviewing the entirety of this disclosure, various possible variations of the at least one supervised machine learning process 828 that may be used to determine the relationship between inputs and outputs. The supervised machine learning process may include a classification algorithm as defined above.
[0081] Referring further to Figure 8, training a supervised machine learning process may include, but is not limited to, iteratively updating coefficients, biases, and weights based on an error function, expected loss, and/or risk function. For example, the output generated by a supervised machine learning model from the input of a training example may be compared with the output of that training example, and an error function may be generated based on the comparison; this may be any error function suitable for use with any machine learning algorithm described herein, such as the square of the difference between one or more sets of compared values. Such an error function may then be used to update one or more weights, biases, coefficients, or other parameters of the machine learning model via any suitable process, which may include, but is not limited to, a gradient descent process, a least-squares process, and/or other processes described herein. This may be done iteratively and/or recursively to adjust such weights, biases, coefficients, or other parameters gradually. In a neural network, the updates may be performed using one or more backpropagation algorithms. The iterative and/or recursive updates described above may be performed until the currently available training data is exhausted and/or until a convergence test is passed. A “convergence test” is a test for a condition selected to indicate that the model and/or its weights, biases, coefficients, or other parameters have reached a certain degree of accuracy. In a convergence test, for example, the difference between two or more consecutive errors or error function values may be evaluated, where a difference below a threshold amount may be taken to indicate convergence. Alternatively or additionally, one or more errors and/or error function values evaluated in training iterations may be compared to a threshold.
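The iterative update and convergence test described above can be illustrated with gradient descent on a one-variable linear model. The function name, learning rate `lr`, tolerance `tol`, and iteration cap are assumptions chosen for the example; the error function is the mean squared difference between predictions and training outputs.

```python
def fit_line(xs, ys, lr=0.01, tol=1e-9, max_iters=100_000):
    """Fit y = w*x + b by gradient descent on the mean squared error.

    Training stops when two consecutive loss values differ by less than tol
    (the convergence test) or when max_iters is reached.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    prev_loss = float("inf")
    loss = prev_loss
    for _ in range(max_iters):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n  # error function value
        if abs(prev_loss - loss) < tol:  # convergence test
            break
        prev_loss = loss
        # gradients of the mean squared error with respect to w and b
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss
```

For training pairs generated by y = 2x + 1, the returned weight and bias approach 2 and 1; a looser tolerance stops earlier at a less accurate fit.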
[0082] Referring further to Figure 8, the computing device, processor, and/or module may be configured to execute any method, process step, sequence of process steps, and/or algorithm described with reference to this figure in any order and with any degree of repetition. For example, the computing device, processor, and/or module may be configured to execute a single step, sequence, and/or algorithm repeatedly until a desired or commanded result is achieved. Repetition of a step or sequence of steps may be performed iteratively and/or recursively, using the output of a previous iteration as input to a subsequent iteration, aggregating the inputs and/or outputs of the iterations to produce an aggregate result, reducing or decrementing one or more variables such as global variables, and/or dividing a larger processing task into a set of smaller processing tasks that are handled iteratively. The computing device, processor, and/or module may execute any step, sequence of steps, or algorithm in parallel, such as by executing a step two or more times simultaneously and/or substantially simultaneously using two or more parallel threads, processor cores, or the like; division of tasks between parallel threads and/or processes may be performed according to any protocol suitable for dividing tasks between iterations. Those skilled in the art, upon reviewing the entirety of this disclosure, will recognize a variety of ways in which steps, sequences of steps, processing tasks, and/or data may be subdivided, shared, or otherwise handled using iterative, recursive, and/or parallel processing.
[0083] Referring further to Figure 8, the machine learning process may include at least one unsupervised machine learning process 832. As used herein, an unsupervised machine learning process is a process that derives inferences from a dataset without regard to labels; as a result, the unsupervised machine learning process is free to discover any structure, relationship, and/or correlation present in the data. The unsupervised process 832 may not require a response variable, and may be used to find interesting patterns and/or inferences between variables, to determine the degree of correlation between two or more variables, and so on.
[0084] Referring further to Figure 8, the machine learning module 800 may be designed and configured to create a machine learning model 824 using techniques for developing linear regression models. A linear regression model may include an ordinary least-squares regression, which aims to minimize the square of the difference between predicted and actual outcomes according to a suitable norm for measuring such a difference (e.g., a vector-space distance norm); the coefficients of the resulting linear equation may be modified to improve the minimization. A linear regression model may include a ridge regression method, in which the function to be minimized includes the least-squares function plus a term multiplying the square of each coefficient by a scalar quantity in order to penalize large coefficients. A linear regression model may include a least absolute shrinkage and selection operator (LASSO) model, in which ridge regression is combined with multiplying the least-squares term by a factor of 1 divided by twice the number of samples. A linear regression model may include a multi-task LASSO model, in which the norm applied in the least-squares term of the LASSO model is the Frobenius norm, which is the square root of the sum of the squares of all terms. Linear regression models may include elastic net models, multi-task elastic net models, least angle regression models, LARS lasso models, orthogonal matching pursuit models, Bayesian regression models, logistic regression models, stochastic gradient descent models, perceptron models, passive aggressive algorithms, robust regression models, Huber regression models, or any other suitable model that may occur to a person skilled in the art upon reviewing the entirety of this disclosure.
In one embodiment, a linear regression model may be generalized to a polynomial regression model, whereby a polynomial (e.g., a quadratic, cubic, or higher-order equation) providing the best fit between predicted and actual outputs is sought. As will be apparent to a person skilled in the art upon reviewing the entirety of this disclosure, methods similar to those described above may be applied to minimize the error function.
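To make the ridge penalty concrete, the sketch below solves the single-coefficient case in closed form. It assumes a model y = w·x with no intercept and minimizes the sum of squared errors plus `penalty` times w squared; the function name and this simplified setting are illustrative assumptions rather than part of this disclosure.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form minimizer of sum((y - w*x)**2) + penalty * w**2.

    Setting the derivative with respect to w to zero gives
    w = sum(x*y) / (sum(x*x) + penalty).
    """
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) + penalty
    return numerator / denominator
```

With a penalty of zero the result equals the ordinary least-squares slope; increasing the penalty shrinks the coefficient toward zero, which is how large coefficients are discouraged.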
[0085] Continuing to refer to Figure 8, machine learning algorithms may include, but are not limited to, linear discriminant analysis. Machine learning algorithms may include quadratic discriminant analysis. Machine learning algorithms may include kernel ridge regression. Machine learning algorithms may include support vector machines, including, but not limited to, regression processes based on support vector classification. Machine learning algorithms may include stochastic gradient descent algorithms, including classification and regression algorithms based on stochastic gradient descent. Machine learning algorithms may include nearest neighbor algorithms. Machine learning algorithms may include various forms of latent space regularization, such as variational regularization. Machine learning algorithms may include Gaussian processes, such as Gaussian process regression. Machine learning algorithms may include cross-decomposition algorithms, including partial least squares and/or canonical correlation analysis. Machine learning algorithms may include naive Bayes methods. Machine learning algorithms may include decision tree-based algorithms, such as decision tree classification or regression algorithms. Machine learning algorithms may include ensemble methods, such as bagging meta-estimators, forests of randomized trees, AdaBoost, gradient tree boosting, and/or voting classifier methods. Machine learning algorithms may include neural network algorithms, including convolutional neural network processes.
[0086] Referring further to Figure 8, machine learning models and/or processes may be deployed or instantiated by being incorporated into programs, devices, systems, and/or modules. For example, and without limitation, a machine learning model, a neural network, and/or some or all of its parameters may be stored and/or deployed in any memory or circuit. Parameters such as coefficients, weights, and/or biases may be stored as circuit-based constants, such as arrays of wires and/or binary inputs and/or outputs set to logic "1" and "0" voltage levels in a logic circuit to represent numbers according to any suitable encoding system, including two's complement, or may be stored in any volatile and/or non-volatile memory. Similarly, mathematical operations and the input and/or output of data to and from models, neural network layers, and the like may be instantiated in hardware circuitry and/or in the form of instructions in firmware, machine code such as binary operation code instructions, assembly language, or any higher-order programming language.
Machine learning processes and/or models may be instantiated using any technique for hardware and/or software instantiation of memory, instructions, data structures, and/or algorithms, including, but not limited to: production and/or configuration of non-reconfigurable hardware elements, circuits, and/or modules, such as ASICs; production and/or configuration of reconfigurable hardware elements, circuits, and/or modules, such as FPGAs; production and/or configuration of non-reconfigurable and/or non-rewritable memory elements, circuits, and/or modules, such as non-rewritable ROM; production and/or configuration of reconfigurable and/or rewritable memory elements, circuits, and/or modules, such as rewritable ROM or other memory technologies described herein; and/or any combination of production and/or configuration of any computing devices and/or components described herein. Such deployed and/or instantiated machine learning models and/or algorithms may receive input from any other process, module, and/or component described in this disclosure and generate output to any other process, module, and/or component described in this disclosure.
[0087] Continuing to refer to Figure 8, any process of training, retraining, deploying, and / or instantiating any machine learning model and / or algorithm can be performed and / or repeated after initial deployment and / or instantiation to correct, refine, and / or improve the machine learning model and / or algorithm. Such retraining, deployment, and / or instantiation may be performed as a periodic or regular process, for example after a set period of time has elapsed, after some measure of quantity such as the number of bytes of data processed or the number of uses or executions of the processes described herein, and / or according to a software, firmware, or other update schedule. Alternatively or additionally, retraining, deployment, and / or instantiation may be event-based, triggered by user input indicating suboptimal or otherwise problematic performance, and / or by automated field testing and / or audit processes, which may compare the output of the machine learning model and / or algorithm, and / or its error and / or error function, to any threshold, convergence test, etc., and / or compare the output of the processes described herein to a similar threshold, convergence test, etc. Event-based retraining, deployment, and / or instantiation may, alternatively or additionally, be triggered by the receipt and / or generation of one or more new training examples, where the number of new training examples can be compared to a pre-configured threshold, and if the threshold is exceeded, retraining, deployment, and / or instantiation can be triggered.
[0088] Referring further to Figure 8, retraining and / or additional training can be performed using any currently or previously deployed version of the machine learning model and / or algorithm as a starting point, using any process for training described above. Training data for retraining can be collected, pre-conditioned, sorted, classified, sanitized, or otherwise processed according to any process described herein. Training data may include, but is not limited to, training examples including inputs and correlated outputs used, received, and / or generated from any version of any system, module, machine learning model or algorithm, apparatus, and / or method described herein. Such examples may be modified and / or labeled according to user feedback or other processes to indicate desired results, and / or may have actual or measured results from processes modeled and / or predicted by the system, module, machine learning model or algorithm, apparatus, and / or method as “desired” results to be compared with the outputs for the training process as described above.
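As a non-limiting illustration, the periodic, quantity-based, and event-based retraining triggers described above can be sketched as a simple policy check. The class names, field names, and all threshold values below are hypothetical and chosen only for illustration; the disclosure does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetrainPolicy:
    # All limits are hypothetical example values.
    max_seconds: float = 7 * 24 * 3600   # periodic trigger: elapsed time
    max_uses: int = 10_000               # quantity trigger: number of executions
    max_error: float = 0.15              # audit trigger: error threshold
    max_new_examples: int = 500          # event trigger: new training examples


@dataclass
class ModelStatus:
    seconds_since_training: float
    uses_since_training: int
    audit_error: float
    new_examples: int


def retraining_reasons(status: ModelStatus, policy: RetrainPolicy) -> List[str]:
    """Return the names of every trigger that currently calls for retraining."""
    reasons = []
    if status.seconds_since_training >= policy.max_seconds:
        reasons.append("periodic")
    if status.uses_since_training >= policy.max_uses:
        reasons.append("usage")
    if status.audit_error > policy.max_error:
        reasons.append("audit")
    if status.new_examples > policy.max_new_examples:
        reasons.append("new_examples")
    return reasons


status = ModelStatus(seconds_since_training=3600, uses_since_training=12_000,
                     audit_error=0.05, new_examples=10)
reasons = retraining_reasons(status, RetrainPolicy())
```

An empty list means no trigger has fired; any non-empty list could be used to schedule retraining and redeployment.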
[0089] Redeployment can be performed by any reconfiguration and / or rewriting of reconfigurable and / or rewritable circuit and / or memory elements, or by generating new hardware and / or software components, circuits, instructions, etc., which may be added to and / or replace existing hardware and / or software components, circuits, instructions, etc. Referring further to Figure 8, one or more of the processes or algorithms described above can be performed by at least one dedicated hardware unit 836. For the purposes of this figure, “dedicated hardware unit” is a hardware component, circuit, etc., separate from the main control circuit and / or processor that performs the steps of the method described herein, which is specifically designated or selected to perform one or more particular tasks and / or processes described with reference to this figure, such as preconditioning and / or sanitizing training data, and / or training machine learning algorithms and / or models. Dedicated hardware unit 836 may include, but is not limited to, a hardware unit that can perform iterative or centralized computations, such as matrix-based computations for updating or adjusting parameters, weights, coefficients, and / or biases of machine learning models and / or neural networks, using pipelined, parallel processing, etc., efficiently, and such a hardware unit may be optimized for such processes by including dedicated circuitry for matrix and / or signal processing operations, including, for example, multiple arithmetic and / or logic circuit units, such as multipliers and / or adders, which can operate simultaneously and / or in parallel. Such dedicated hardware units 836 may include, but are not limited to, a graphical processing unit (GPU), a dedicated signal processing module, an FPGA, or other reconfigurable hardware configured to instantiate parallel processing units for one or more specific tasks. 
A computing device, processor, apparatus, or module may be configured to instruct one or more dedicated hardware units 836 to perform one or more operations described herein, such as evaluating model and / or algorithm outputs, one-time or iterative updates of parameters, coefficients, weights, and / or biases, and / or any other operations such as vector and / or matrix operations as described herein.
[0090] Referring now to Figure 9, an exemplary embodiment of neural network 900 is shown. Neural network 900, also known as an artificial neural network, is a network of “nodes,” each a data structure having one or more inputs, one or more outputs, and a function that determines the outputs based on the inputs. Such nodes can be organized into a network, such as, without limitation, a convolutional neural network, which includes an input layer of nodes 904, one or more hidden layers 908, and an output layer of nodes 912. Connections between nodes can be created through a process of “training” the network, in which elements from a training dataset are applied to the input nodes and a suitable training algorithm (such as the Levenberg-Marquardt method, conjugate gradient, simulated annealing, or other algorithms) is then used to adjust the connections and weights between nodes in adjacent layers of the neural network to produce desired values at the output nodes. This process is sometimes called deep learning. Connections may run only from input nodes toward output nodes in a “feedforward” network, or the output of one layer may be fed back to the input of the same or a different layer in a “recurrent network.” As a further non-limiting example, a neural network may include a convolutional neural network having an input layer, one or more hidden layers, and an output layer. When used in this disclosure, a “convolutional neural network” is a neural network in which at least one hidden layer is a convolutional layer that convolves the input to that layer with a subset of inputs known as a “kernel,” along with one or more additional layers such as a pooling layer or a fully connected layer.
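As a non-limiting illustration of a feedforward pass through a convolutional layer followed by a fully connected layer, the following minimal sketch uses plain Python lists. The image values, the 2x2 kernel, the weights, and all function names are assumptions for illustration only; no training is performed.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding.

    As in most deep learning libraries, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu_map(feature_map):
    """Apply a ReLU nonlinearity to every element of a feature map."""
    return [[max(0, v) for v in row] for row in feature_map]


def fully_connected(vector, weights, biases):
    """One output per weight row: weighted sum of the inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [[1, 0], [0, -1]]                       # illustrative 2x2 kernel
hidden = relu_map(conv2d_valid(image, kernel))   # convolutional hidden layer
flat = [v for row in hidden for v in row]        # flatten for the dense layer
output = fully_connected(flat, weights=[[0.5] * len(flat)], biases=[0.0])
```

Data flows in one direction only, from the input through the hidden layer to the output, which is what makes this a feedforward arrangement.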
[0091] Referring now to Figure 10, an exemplary embodiment of a neural network node 1000 is shown. A node can receive, without limitation, multiple inputs x_i from inputs to the neural network containing the node and / or from other nodes. A node can compute a binary step function that compares an input to a threshold and outputs either a logical 1 or a logical 0 output or equivalent, and / or execute one or more activation functions that, given one or more inputs, produce an output, such as a linear activation function whose output is directly proportional to the input, and / or a nonlinear activation function whose output is not proportional to the input. A nonlinear activation function, given an input x, can be expressed formally as in the following equations:
[Math. 1] to [Math. 6] (equations not reproduced in this text)
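As a non-limiting illustration of a node that applies a binary step, linear, or nonlinear activation function to its inputs, consider the following minimal sketch. Because the equations of this paragraph are not reproduced in the text, the sigmoid and ReLU functions below are common examples chosen as assumptions, not the specific functions of the disclosure; the weighted sum with a bias is likewise an assumption.

```python
import math


def binary_step(x, threshold=0.0):
    """Output logical 1 when the input reaches the threshold, otherwise 0."""
    return 1 if x >= threshold else 0


def linear(x, slope=1.0):
    """Linear activation: output directly proportional to the input."""
    return slope * x


def sigmoid(x):
    """A common nonlinear activation (assumed example)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Another common nonlinear activation (assumed example)."""
    return max(0.0, x)


def node_output(inputs, weights, bias, activation):
    """Combine the inputs x_i as a weighted sum plus bias, then apply the activation.

    The weights and bias are illustrative assumptions.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


inputs = [1.0, 2.0]
weights = [0.5, 0.25]
bias = -1.0                      # weighted sum is 0.5 + 0.5 - 1.0 = 0.0
step_out = node_output(inputs, weights, bias, binary_step)
sigmoid_out = node_output(inputs, weights, bias, sigmoid)
relu_out = node_output(inputs, weights, bias, relu)
```

Swapping the `activation` argument changes only the final mapping; the way the node gathers its inputs stays the same.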
[0092] Note that any one or more of the aspects and embodiments described herein can be advantageously implemented using one or more machines (e.g., one or more computing devices utilized as user computing devices for electronic documents, one or more server devices such as document servers, etc.) programmed in accordance with the teachings of this specification, as will be apparent to those skilled in the computer art. As will be apparent to those skilled in the software art, appropriate software coding can be readily created by a skilled programmer based on the teachings of this disclosure. The above aspects and embodiments using software and / or software modules may also include appropriate hardware for supporting the implementation of machine-executable instructions of the software and / or software modules.
[0093] Such software may be a computer program product that uses a machine-readable storage medium. A machine-readable storage medium may be any medium capable of storing and / or encoding a sequence of instructions for execution by a machine (e.g., a computing device) and causing a machine to execute any one of the methods and / or embodiments described herein. Examples of machine-readable storage mediums include, but are not limited to, magnetic disks, optical disks (e.g., CDs, CD-Rs, DVDs, DVD-Rs, etc.), magneto-optical disks, read-only memory "ROM" devices, random access memory "RAM" devices, magnetic cards, optical cards, solid-state memory devices, EPROMs, EEPROMs, and any combination thereof. As used herein, a machine-readable medium is intended to include a single medium, as well as a collection of physically separate media, such as a compact disk in combination with computer memory or a collection of one or more hard disk drives. As used herein, a machine-readable storage medium does not involve the transmission of signals in a transient form.
[0094] Such software may also include information (e.g., data) carried as data signals on a data carrier, such as a carrier wave. For example, machine-executable information may be included as data-carrying signals embodied on a data carrier, where the signals encode a sequence or portion thereof of instructions for execution by a machine (e.g., a computing device), and any relevant information (e.g., data structures and data) that causes the machine to execute any one of the methods and / or embodiments described herein.
[0095] Examples of computing devices include, but are not limited to, e-book readers, computer workstations, terminal computers, server computers, handheld devices (e.g., tablet computers, smartphones, etc.), web appliances, network routers, network switches, network bridges, any machine capable of executing a sequence of instructions specifying the actions it should take, and any combination thereof. For example, a computing device may include and / or be included in a kiosk.
[0096] Figure 11 shows a graphical representation of one embodiment of a computing device in exemplary form of a computer system 1100, in which a set of instructions for causing a control system to execute any one or more aspects and / or methodologies of the present disclosure can be executed within it. It is also conceivable that multiple computing devices could be used to implement a specially configured set of instructions for causing one or more of the devices to execute any one or more aspects and / or methodologies of the present disclosure. The computer system 1100 includes a processor 1104 and memory 1108 that communicate with each other and with other components via a bus 1112. The bus 1112 may include any of several types of bus structures, including but not limited to a memory bus, a memory controller, a peripheral bus, a local bus, and any combination thereof, using any of various bus architectures.
[0097] The processor 1104 may include, but is not limited to, any suitable processor, such as a processor incorporating logic circuits for performing arithmetic and logic operations, such as an arithmetic and logic unit (ALU), which may be controlled by a state machine and directed by operational inputs from memory and / or sensors. As a non-limiting example, the processor 1104 may be organized according to the von Neumann architecture and / or the Harvard architecture. The processor 1104 may include, and / or be incorporated into, without limitation, microcontrollers, microprocessors, digital signal processors (DSPs), field programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), graphical processing units (GPUs), general-purpose GPUs, tensor processing units (TPUs), analog or mixed-signal processors, trusted platform modules (TPMs), floating-point units (FPUs), system-on-modules (SOMs), and / or systems-on-chip (SoCs).
[0098] Memory 1108 may include a variety of components (e.g., machine-readable media) including, but not limited to, random-access memory components, read-only components, and any combination thereof. For example, a basic input / output system (BIOS) 1116, which includes basic routines useful for transferring information between elements within a computer system 1100 during startup, may be stored in memory 1108. Memory 1108 may also include instructions (e.g., software) 1120 (e.g., stored in one or more machine-readable media) that embody any one or more aspects and / or methodologies of this disclosure. In another example, memory 1108 may further include any number of program modules, including, but not limited to, an operating system, one or more application programs, other program modules, program data, and any combination thereof.
[0099] The computer system 1100 may also include a storage device 1124. Examples of storage devices (e.g., storage device 1124) include, but are not limited to, hard disk drives, magnetic disk drives, optical disk drives combined with optical media, solid-state memory devices, and any combination thereof. The storage device 1124 may be connected to the bus 1112 by a suitable interface (not shown). Exemplary interfaces include, but are not limited to, SCSI, advanced technology attachment (ATA), serial ATA, universal serial bus (USB), IEEE 1394 (FIREWIRE®), and any combination thereof. In one example, the storage device 1124 (or one or more of its components) may be detachably interfaced with the computer system 1100 (e.g., via an external port connector (not shown)). In particular, the storage device 1124 and associated machine-readable media 1128 can provide non-volatile and / or volatile storage for machine-readable instructions, data structures, program modules, and / or other data for the computer system 1100. In one example, the software 1120 may reside entirely or partially within the machine-readable media 1128. In another example, the software 1120 may reside entirely or partially within the processor 1104.
[0100] The computer system 1100 may also include an input device 1132. In one example, a user of the computer system 1100 may input commands and / or other information to the computer system 1100 via the input device 1132. Examples of input devices 1132 include, but are not limited to, alphanumeric input devices (e.g., keyboards), pointing devices, joysticks, gamepads, audio input devices (e.g., microphones, voice response systems, etc.), cursor control devices (e.g., mice), touchpads, optical scanners, video capture devices (e.g., still cameras, video cameras), touchscreens, and any combination thereof. The input device 1132 can interface to the bus 1112 via any of a variety of interfaces (not shown) including, but not limited to, serial interfaces, parallel interfaces, game ports, USB interfaces, FIREWIRE® interfaces, direct interfaces to the bus 1112, and any combination thereof. The input device 1132 may include a touchscreen interface, which may be part of or separate from the display 1136, as will be further described below. As described above, the input device 1132 can be used as a user selection device for selecting one or more graphical representations within the graphical interface.
[0101] The user can also input commands and / or other information into the computer system 1100 via a storage device 1124 (e.g., a removable disk drive, flash drive, etc.) and / or a network interface device 1140. Network interface devices such as network interface device 1140 can be used to connect the computer system 1100 to one or more of various networks, such as network 1144, and one or more remote devices 1148 connected to it. Examples of network interface devices include, but are not limited to, network interface cards (e.g., mobile network interface cards, LAN cards), modems, and any combination thereof. Examples of networks include, but are not limited to, wide area networks (e.g., the Internet, corporate networks), local area networks (e.g., networks associated with offices, buildings, campuses, or other relatively small geographical spaces), telephone networks, data networks associated with telephone / voice providers (e.g., mobile communications provider data and / or voice networks), direct connections between two computing devices, and any combination thereof. Networks such as network 1144 may use wired and / or wireless communication modes. In general, any network topology may be used. Information (e.g., data, software 1120, etc.) can be communicated to and from the computer system 1100 via the network interface device 1140.
[0102] The computer system 1100 may further include a video display adapter 1152 for communicating displayable images to a display device such as display device 1136. Examples of display devices include, but are not limited to, liquid crystal displays (LCDs), cathode ray tubes (CRTs), plasma displays, light-emitting diode (LED) displays, and any combination thereof. The display adapter 1152 and the display device 1136 may be used in combination with the processor 1104 to provide a graphical representation of an aspect of the disclosure. In addition to the display device, the computer system 1100 may include one or more other peripheral output devices, including, but not limited to, audio speakers, printers, and any combination thereof. Such peripheral output devices may be connected to the bus 1112 via a peripheral interface 1156. Examples of peripheral interfaces include, but are not limited to, serial ports, USB connections, FIREWIRE® connections, parallel connections, and any combination thereof.
[0103] The above has been a detailed description of exemplary embodiments of the present invention. Various modifications and additions can be made without departing from the spirit and scope of the invention. Each feature of the various embodiments described above can be combined as appropriate with features of other described embodiments to provide a multiplicity of feature combinations in associated new embodiments. Furthermore, although several distinct embodiments have been described above, those described herein are merely illustrative of the application of the principles of the present invention. In addition, certain methods herein may be illustrated and / or described as being performed in a particular order, but the order may be varied considerably within ordinary skill to achieve the methods, systems, and software according to this disclosure. Therefore, this description is intended to be construed as illustrative only and does not limit the scope of the invention.
[0104] Exemplary embodiments are disclosed above and shown in the accompanying drawings. Those skilled in the art will understand that various modifications, omissions, and additions can be made to those specifically disclosed herein without departing from the spirit and scope of the invention.
[0105] (Note 1) An apparatus for automatically verifying quality data associated with at least one slide, the apparatus comprising: at least one optical device comprising at least one camera configured to scan the at least one slide, wherein the at least one slide comprises two or more stained cores; and at least one computing device communicatively connected to the at least one optical device, the computing device comprising a memory and at least one processor communicatively connected to the memory, wherein the memory contains instructions that configure the at least one processor to: receive, from the at least one optical device, at least one digital slide corresponding to the at least one slide; determine a slide identification using metadata; locate the at least one digital slide using at least one object detection technique, wherein locating the at least one digital slide includes identifying a target stained core and aligning the target stained core into a standard format; evaluate the target stained core using a predetermined threshold, wherein the evaluation includes comparing the target stained core with a control core and generating a verification output as a function of the comparison between the target stained core and the control core; and transmit the verification output to a display device.
(Note 2) The apparatus according to Note 1, wherein the at least one slide includes an immunohistochemistry slide, and the two or more stained cores of the immunohistochemistry slide include a tissue sample.
(Note 3) The apparatus according to Note 1, wherein determining the slide identification includes extracting the metadata from slide labels using an image processing algorithm.
(Note 4) The apparatus according to Note 1, wherein determining the slide identification using the metadata includes extracting the slide identification from an electronic health record database.
(Note 5) The apparatus according to Note 1, wherein generating the verification output includes utilizing a stain statistical model configured to calculate at least one predetermined value, the at least one predetermined value being configured to verify the at least one digital slide.
(Note 6) The apparatus according to Note 1, wherein aligning the target stained core includes one or more of the following: rotating the at least one digital slide, translating the at least one digital slide, and reflecting the at least one digital slide.
(Note 7) The apparatus according to Note 1, wherein the apparatus is further configured to: identify the target stained core using metadata; use at least one mirroring operation; superimpose the target stained core on a verification control; and align the two or more stained cores of the target, including a mirror image configuration, into the standard format.
(Note 8) The apparatus according to Note 7, wherein aligning the two or more stained cores, including the mirror image configuration, into the standard format includes using the textures, deconvoluted channels, and at least one stain of the two or more stained cores.
(Note 9) The apparatus according to Note 1, wherein time data associated with the at least one slide is used to determine which of a plurality of preprocessing steps generated a failure associated with the at least one slide by: organizing the time data associated with the at least one slide in a time series; organizing the plurality of preprocessing steps corresponding to the time data in a time series; and identifying, using the time data, the one of the plurality of preprocessing steps that corresponds to the failure.
(Note 10) The apparatus according to Note 1, wherein the verification output is further configured to cross-validate a positive control core, the cross-validation including: configuring one or more optical device parameters of the at least one optical device as a function of the target stained core and the positive control core; scanning the target stained core at a magnified resolution using the at least one optical device; comparing the target stained core with the control core; and generating a cross-validation output.
(Note 11) A method for automatically verifying quality data associated with at least one slide, the method comprising: receiving, from at least one optical device, at least one digital slide corresponding to the at least one slide, wherein the at least one slide includes two or more stained cores; determining a slide identification using metadata; locating the at least one digital slide using at least one object detection technique, wherein locating the at least one digital slide includes identifying a target stained core and aligning the target stained core into a standard format; and evaluating the target stained core using a predetermined threshold, wherein the evaluating includes comparing the target stained core with a control core, generating a verification output as a function of the comparison between the target stained core and the control core, transmitting the verification output to a downstream device, and transmitting a cross-validation output to a display device for further operation.
(Note 12) The method according to Note 11, wherein the at least one slide includes an immunohistochemistry slide, and the two or more stained cores of the immunohistochemistry slide include a tissue sample.
(Note 13) The method according to Note 11, wherein determining the slide identification includes extracting the metadata from slide labels using an image processing algorithm.
(Note 14) The method according to Note 11, wherein determining the slide identification using the metadata includes extracting the slide identification from an electronic health record database.
(Note 15) The method according to Note 11, wherein generating the verification output includes utilizing a stain statistical model configured to calculate at least one predetermined value, the at least one predetermined value being configured to verify the at least one digital slide.
(Note 16) The method according to Note 11, wherein aligning the target stained core includes one or more of the following: rotating the at least one digital slide, translating the at least one digital slide, and reflecting the at least one digital slide.
(Note 17) The method according to Note 11, further comprising: identifying the target stained core using metadata; using at least one mirroring operation; superimposing the target stained core on a verification control; and aligning the two or more stained cores of the target, including a mirror image configuration, into the standard format.
(Note 18) The method according to Note 17, wherein aligning the two or more stained cores, including the mirror image configuration, into the standard format includes using the textures, deconvoluted channels, and at least one stain of the two or more stained cores.
(Note 19) The method according to Note 11, wherein time data associated with the at least one slide is used to determine which of a plurality of preprocessing steps generated a failure associated with the at least one slide by: organizing the time data associated with the at least one slide in a time series; organizing the plurality of preprocessing steps corresponding to the time data in a time series; and identifying, using the time data, the one of the plurality of preprocessing steps that corresponds to the failure.
(Note 20) The method according to Note 11, wherein the verification output is further configured to cross-validate a positive control core, the cross-validation including: configuring one or more optical device parameters of the at least one optical device as a function of the target stained core and the positive control core; scanning the target stained core at a magnified resolution using the at least one optical device; comparing the target stained core with the control core; and generating a cross-validation output.
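As a non-limiting illustration of the alignment and evaluation steps recited in the notes above, the following minimal sketch tries rotations and reflections of a small grid of core labels until it matches a standard layout, then compares a target core with a control core against a threshold. The grid labels, the standard layout, the use of mean stain intensity as the compared statistic, and the threshold value are all hypothetical assumptions; the disclosure does not prescribe them.

```python
from statistics import mean


def rotate90(grid):
    """Rotate a grid of cores 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def reflect(grid):
    """Mirror a grid of cores left to right."""
    return [row[::-1] for row in grid]


def candidate_orientations(grid):
    """All eight combinations of quarter-turn rotations and a reflection."""
    candidates = []
    current = grid
    for _ in range(4):
        candidates.append(current)
        candidates.append(reflect(current))
        current = rotate90(current)
    return candidates


def align_to_standard(grid, standard):
    """Return the orientation of the grid that matches the standard layout."""
    for candidate in candidate_orientations(grid):
        if candidate == standard:
            return candidate
    raise ValueError("no rotation or reflection matches the standard layout")


def evaluate_core(target_pixels, control_pixels, threshold):
    """Compare mean stain intensity of target and control (assumed statistic)."""
    difference = abs(mean(target_pixels) - mean(control_pixels))
    return {"difference": difference, "verified": difference <= threshold}


# Hypothetical standard layout and a scan that arrived rotated and mirrored.
standard = [["pos_ctrl", "neg_ctrl"], ["target", "blank"]]
scanned = reflect(rotate90(standard))
aligned = align_to_standard(scanned, standard)

result = evaluate_core([0.62, 0.58, 0.60], [0.55, 0.65, 0.60], threshold=0.1)
```

The `result` dictionary plays the role of a verification output in this sketch; a real system would derive the compared statistics from image data rather than from hand-written lists.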
Claims
1. An apparatus for automatically verifying quality data associated with at least one slide, the apparatus comprising: at least one optical device comprising at least one camera configured to scan the at least one slide, wherein the at least one slide comprises two or more stained cores, and scanning the at least one slide creates at least one digital slide corresponding to the at least one slide; and at least one computing device communicatively connected to the at least one optical device, the computing device comprising a memory and at least one processor communicatively connected to the memory, wherein the memory contains instructions that configure the at least one processor to: receive, from the at least one optical device, the at least one digital slide corresponding to the at least one slide; determine a slide identification using metadata; locate the at least one digital slide using at least one object detection technique, wherein locating the at least one digital slide includes identifying a target stained core from the two or more stained cores and aligning the target stained core to a standard format, wherein the standard format includes at least data entry rules; and evaluate the target stained core using a predetermined threshold, wherein the evaluation includes: comparing the target stained core with a control core; generating a verification output as a function of the comparison between the target stained core and the control core, wherein generating the verification output includes utilizing a stain statistical model configured to calculate at least one predetermined value, the at least one predetermined value being configured to verify the at least one digital slide; transmitting the verification output to a display device; confirming the reliability of the verification output by cross-validating a positive control core of the verification output, wherein the cross-validation includes configuring one or more optical device parameters, including at least a magnification level and a scanning speed of the optical device, as a function of the target stained core and the positive control core, such that the optical device parameters are adjusted to ensure precision in specific attributes of the stained core, scanning the target stained core at a magnified resolution using the at least one optical device, comparing the target stained core with the control core, and generating a cross-validation output; and transmitting the cross-validation output to the display device for further operation.
2. The apparatus according to claim 1, wherein the at least one slide includes an immunohistochemistry slide, and the two or more stained cores of the immunohistochemistry slide include a tissue sample.
3. The apparatus according to claim 1, wherein determining the slide identification includes extracting the metadata from slide labels using an image processing algorithm.
4. The apparatus according to claim 1, wherein determining the slide identification using the metadata includes extracting the slide identification from an electronic health record database.
5. The apparatus according to claim 1, wherein aligning the stained cores of the target includes one or more of the following: rotating the at least one digital slide, translating the at least one digital slide, and reflecting the at least one digital slide.
6. The apparatus according to claim 1, wherein the apparatus is further configured to: identify the target stained core using metadata; use at least one mirroring operation; superimpose the target stained core on a verification control; and align the stained cores of the target, including a mirror image configuration, into the standard format.
7. The apparatus according to claim 6, wherein aligning the two or more stained cores, including the mirror image configuration, into the standard format comprises using the texture, deconvoluted channels, and at least one stain of the two or more stained cores.
8. The apparatus according to claim 1, wherein time data associated with the at least one slide is used to determine which of a plurality of preprocessing steps generated a failure associated with the at least one slide, by: organizing the time data associated with the at least one slide in chronological order; organizing the plurality of preprocessing steps corresponding to the time data in a time series; and identifying, using the time data, the one of the plurality of preprocessing steps that corresponds to the failure.
9. The apparatus according to claim 1, wherein the verification output is further configured to cross-validate the positive control core, wherein the cross-validation includes: configuring one or more optical device parameters of the optical device as a function of the target stained core and the positive control core; scanning the target stained core at a magnified resolution using the at least one optical device; comparing the target stained core with the control core; and generating a cross-validation output.
10. A method for automatically verifying quality data associated with at least one slide, the method comprising: receiving, from at least one optical device, at least one digital slide corresponding to the at least one slide, wherein the at least one slide includes two or more stained cores; determining a slide identification using metadata; locating the at least one digital slide using at least one object detection technique, wherein locating the at least one digital slide includes identifying a target stained core from the two or more stained cores and aligning the target stained core into a standard format; and evaluating the target stained core using a predetermined threshold, wherein the evaluating includes: comparing the target stained core with a control core; generating a verification output as a function of the comparison between the target stained core and the control core, wherein generating the verification output includes utilizing a stain statistical model configured to calculate at least one predetermined value, the at least one predetermined value being configured to validate the at least one digital slide; transmitting the verification output to a downstream device; confirming the reliability of the verification output by cross-validating a positive control core of the verification output, wherein the cross-validating includes configuring one or more optical device parameters, including at least a magnification level and a scanning speed of the optical device, as a function of the target stained core and the positive control core, such that the optical device parameters are adjusted to ensure precise capture of specific attributes of the stained core, scanning the target stained core at a magnified resolution using the at least one optical device, comparing the target stained core with the control core, and generating a cross-validation output; and transmitting the cross-validation output to a display device for further operation.
11. The method according to claim 10, wherein the at least one slide comprises an immunohistochemistry slide, and the two or more stained cores of the immunohistochemistry slide comprise a tissue sample.
12. The method according to claim 10, wherein determining the slide identification includes extracting the metadata from a slide label using an image processing algorithm.
13. The method according to claim 10, wherein determining the slide identification using the metadata includes extracting the slide identification from an electronic health record database.
14. The method according to claim 10, wherein aligning the stained cores of the target includes one or more of rotating the at least one digital slide, translating the at least one digital slide, and reflecting the at least one digital slide.
15. The method according to claim 10, further comprising: identifying the target stained core using metadata; handling at least one mirror image configuration; superimposing the stained core on a verification control; and aligning the target stained core, including the mirror image configuration, into the standard format.
16. The method according to claim 15, wherein aligning the two or more stained cores, including the mirror image configuration, into the standard format comprises using the texture, deconvoluted channels, and at least one stain of the two or more stained cores.
17. The method according to claim 10, wherein time data associated with the at least one slide is used to determine which of a plurality of preprocessing steps generated a failure associated with the at least one slide, by: organizing the time data associated with the at least one slide in chronological order; organizing the plurality of preprocessing steps corresponding to the time data in a time series; and identifying, using the time data, the one of the plurality of preprocessing steps that corresponds to the failure.
18. The method according to claim 10, wherein the verification output is further configured to cross-validate the positive control core, wherein the cross-validation includes: configuring one or more optical device parameters of the optical device as a function of the target stained core and the positive control core; scanning the target stained core at a magnified resolution using the at least one optical device; comparing the target stained core with the control core; and generating a cross-validation output.
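
Illustrative sketch (not part of the claims): the time-data steps recited in claims 8 and 17 could be approximated as follows. The claims do not specify how a failure is matched to a preprocessing step, so this sketch assumes each step is recorded with a start time and attributes the failure to the latest step that had started at or before the failure time. The names `StepRecord` and `find_failed_step`, and the example step names, are hypothetical and do not appear in the source.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Iterable, NamedTuple, Optional


class StepRecord(NamedTuple):
    """Hypothetical record: one preprocessing step and the time it started."""
    name: str
    started_at: datetime


def find_failed_step(records: Iterable[StepRecord],
                     failure_time: datetime) -> Optional[str]:
    """Return the name of the step assumed to correspond to the failure."""
    # Organize the time data (and the steps tied to it) in chronological order.
    timeline = sorted(records, key=lambda r: r.started_at)
    start_times = [r.started_at for r in timeline]

    # Assumption: the failure belongs to the latest step whose start time
    # is at or before the failure time.
    idx = bisect_right(start_times, failure_time) - 1
    return timeline[idx].name if idx >= 0 else None


if __name__ == "__main__":
    # Example records, deliberately given out of order.
    records = [
        StepRecord("staining", datetime(2025, 1, 6, 11, 0)),
        StepRecord("fixation", datetime(2025, 1, 6, 8, 0)),
        StepRecord("coverslipping", datetime(2025, 1, 6, 12, 15)),
        StepRecord("sectioning", datetime(2025, 1, 6, 9, 30)),
    ]
    print(find_failed_step(records, datetime(2025, 1, 6, 11, 40)))
```

Sorting once and using a binary search keeps the lookup cheap when many failures are checked against the same timeline; a real system would likely also use step end times or explicit error logs, which this sketch omits.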