Systems and methods of automatic material verification for manufacturing

An automated verification system using imaging devices and machine learning models addresses inefficiencies in construction by accurately identifying and verifying construction materials, reducing human error and expediting housing construction.

WO2026129049A1 · PCT designated stage · Publication Date: 2026-06-25 · PROMISE ROBOTICS INC


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: PROMISE ROBOTICS INC
Filing Date: 2025-12-18
Publication Date: 2026-06-25



Abstract

Methods and systems for the automated verification of a plurality of workpieces are disclosed. The method comprises acquiring, at one or more imaging devices, one or more input images of the plurality of workpieces at a predefined area; segmenting at least one input image based on a semantic segmentation model; identifying one or more identified workpiece instances based on an instance segmentation model; generating at least one corresponding instance segmented image; determining, based on an edge detection model, one or more edges and one or more corners for the one or more identified workpiece instances; mapping each of the one or more edges and the one or more corners to one or more sets of coordinates on a workpiece plane; determining one or more measured characteristics of each workpiece instance; and verifying, for each workpiece instance, the one or more measured characteristics of the workpiece instance, producing a verification indication.


Description

TITLE: SYSTEMS AND METHODS OF AUTOMATIC MATERIAL VERIFICATION FOR MANUFACTURING

FIELD

[0001] The present disclosure generally relates to assembly and manufacturing of building structures, including building structures used in the assembly of housing units as well as other infrastructure, and in particular, to methods, systems and devices for automated verification of materials used in such manufactures.

INTRODUCTION

[0002] The following is not an admission that anything discussed below is part of the prior art or part of the common general knowledge of a person skilled in the art.

[0003] In recent years, many urban centers have experienced an increasing shortage of housing (e.g., single-family homes and condominium units) caused, in part, by a low supply of new housing construction that has lagged behind growing consumer demand. The low supply of new housing construction is driven by a combination of factors, including antiquated and manual construction processes that result in elongated construction timelines, as well as an increasing absence of a skilled labor workforce (e.g., skilled construction workers).

SUMMARY

[0004] The following introduction is provided to introduce the reader to the more detailed discussion to follow. The introduction is not intended to limit or define any claimed or as yet unclaimed invention. One or more inventions may reside in any combination or sub-combination of the elements or process steps disclosed in any part of this document including its claims and figures.

[0005] In one broad aspect, in accordance with some embodiments, there is generally provided a method for automated verification of a plurality of workpieces. The plurality of workpieces may be located in a predefined area at a physical location and may comprise workpieces of differing dimensions. Additionally, at least some of the plurality of workpieces are in a vertical orientation. The method comprises acquiring, at one or more imaging devices in communication with a processor, one or more input images of the plurality of workpieces at the predefined area. The method further comprises segmenting, at the processor, at least one input image of the one or more input images based on a semantic segmentation model to generate at least one corresponding semantically segmented image, the at least one semantically segmented image comprising at least one visible segment and at least one masked segment. The method further comprises identifying, at the processor, one or more identified workpiece instances from the at least one input image based on an instance segmentation model, the at least one semantically segmented image and one or more context inputs. The method further comprises generating, at the processor, at least one corresponding instance segmented image based on the one or more identified workpiece instances. The method further comprises determining, at the processor, based on an edge detection model, one or more edges and one or more corners for the one or more identified workpiece instances for the at least one corresponding instance image. The method further comprises mapping, at the processor, each of the one or more edges and the one or more corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration. The method further comprises determining, at the processor, one or more measured characteristics of each workpiece instance based at least on the one or more sets of coordinates. The method further comprises verifying, at the processor, for each workpiece instance, the one or more measured characteristics of the workpiece instance with one or more expected characteristics associated with the predefined location, producing a verification indication. The method further comprises outputting, at a user interface in communication with the processor, the verification indication on a display of the user interface.
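The final verifying step of this aspect can be illustrated with a minimal sketch. The tolerance-based comparison, the field names, and the `Expected` and `verify` names are assumptions made for illustration only; the disclosure does not specify how measured and expected characteristics are compared.

```python
from dataclasses import dataclass


@dataclass
class Expected:
    """Hypothetical expected characteristics for one predefined location."""
    length_mm: float
    width_mm: float
    tolerance_mm: float  # assumed acceptance band, not taken from the disclosure


def verify(measured: dict, expected: Expected) -> dict:
    """Compare measured length and width against the expected values."""
    errors = {
        "length_mm": abs(measured["length_mm"] - expected.length_mm),
        "width_mm": abs(measured["width_mm"] - expected.width_mm),
    }
    passed = all(err <= expected.tolerance_mm for err in errors.values())
    # The returned dict plays the role of a "verification indication".
    return {"passed": passed, "errors_mm": errors}


indication = verify(
    {"length_mm": 2438.0, "width_mm": 89.5},
    Expected(length_mm=2440.0, width_mm=89.0, tolerance_mm=3.0),
)
```

A user interface could render `indication["passed"]` as a pass or fail marker next to each workpiece instance.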

[0006] In some embodiments, the plurality of workpieces may be arranged on a support structure.

[0007] In some embodiments, the support structure may comprise a plurality of referencing members for aligning the plurality of workpieces with respect to one another.

[0008] In some embodiments, a subset of referencing members from the plurality of referencing members may cooperate to define at least one channel for receiving at least one workpiece of the plurality of workpieces.

[0009] In some embodiments, the referencing members may comprise at least one of a pin protruding out of a surface of the support structure, and a plate arranged orthogonally to the surface of the support structure.

[0010] In some embodiments, the support structure may comprise a wall, a shelf, or a material cart.

[0011] In some embodiments, the workpiece plane calibration may comprise a calibration board placed on the workpiece plane, the calibration board comprising a pattern readable by the processor to determine at least one of a relative position and a relative orientation.
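One common way to use a planar calibration is to represent it as a 3x3 homography that maps image pixels to coordinates on the plane. The sketch below assumes such a matrix has already been estimated from the calibration board; the disclosure does not state that a homography is used, and `map_to_plane` and `H_EXAMPLE` are hypothetical names with made-up values.

```python
def map_to_plane(H, pixel):
    """Apply a 3x3 homography H (nested lists, row-major) to a pixel (u, v)."""
    u, v = pixel
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("pixel maps to a point at infinity")
    # Divide by the homogeneous coordinate to get plane coordinates.
    return (x / w, y / w)


# Hypothetical calibration result: 0.5 mm per pixel, origin offset by (10, 20) mm.
H_EXAMPLE = [
    [0.5, 0.0, 10.0],
    [0.0, 0.5, 20.0],
    [0.0, 0.0, 1.0],
]

corner_mm = map_to_plane(H_EXAMPLE, (200, 100))
```

With this example matrix the mapping is a pure scale and shift; a camera viewing the plane at an angle would yield non-zero entries in the last row.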

[0012] In some embodiments, the context input may comprise at least one of a positive prompt and a negative prompt, wherein the positive prompt corresponds to a region where workpieces are likely to be present and the negative prompt corresponds to a region where workpieces are likely to be absent.
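As an illustration, a context input of this kind could be represented as a list of labeled points. This is a sketch under assumptions: the point-based encoding, the 1/0 label convention, and the names `PointPrompt` and `build_context_input` are illustrative and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointPrompt:
    x: int
    y: int
    label: int  # assumed convention: 1 = positive (likely present), 0 = negative (likely absent)


def build_context_input(positive_points, negative_points):
    """Combine positive and negative pixel locations into one prompt list."""
    prompts = [PointPrompt(x, y, 1) for x, y in positive_points]
    prompts += [PointPrompt(x, y, 0) for x, y in negative_points]
    return prompts


# Hypothetical pixel locations for one predefined area.
context = build_context_input(
    positive_points=[(320, 400), (480, 400)],
    negative_points=[(50, 50)],
)
coords = [(p.x, p.y) for p in context]
labels = [p.label for p in context]
```

Keeping coordinates and labels as parallel lists matches the input shape that many promptable segmentation models expect, though the exact interface depends on the model used.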

[0013] In some embodiments, the context input may be predetermined for each predefined area based on characteristics of the imaging device.

[0014] In some embodiments, the one or more measured characteristics may comprise one or more of a length, a width, a height, a position, and a material type.
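A minimal sketch of how a length, a width, and a position might be derived from four corner coordinates that have already been mapped onto the workpiece plane. The rectangular-outline assumption, the side-averaging step, and the name `measure_rectangle` are illustrative choices, not details from the disclosure.

```python
import math


def measure_rectangle(corners):
    """corners: four (x, y) points in plane units, ordered around the outline."""
    sides = [math.dist(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    # Average opposite sides to damp noise in individual corner positions.
    a = (sides[0] + sides[2]) / 2
    b = (sides[1] + sides[3]) / 2
    # Use the centroid of the corners as the position.
    cx = sum(x for x, _ in corners) / 4
    cy = sum(y for _, y in corners) / 4
    return {"length": max(a, b), "width": min(a, b), "position": (cx, cy)}


# Hypothetical corners (in mm) of a vertically oriented board.
m = measure_rectangle([(0, 0), (38, 0), (38, 2440), (0, 2440)])
```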

[0015] In some embodiments, the identified workpiece instances may comprise a lumber instance.

[0016] In some embodiments, the one or more imaging devices may comprise at least one monocular camera.

[0017] In some embodiments, the one or more imaging devices may comprise at least one stereoscopic camera.

[0018] In some embodiments, the one or more imaging devices may comprise at least one monocular camera and at least one stereoscopic camera.

[0019] In some embodiments, the instance segmentation model may be finetuned on one or more training materials, the one or more training materials associated with a material type of at least one workpiece of the plurality of workpieces.

[0020] In some embodiments, the support structure may be colored with one or more pre-selected colors, and the segmenting the at least one input image may be further based on the one or more pre-selected colors.

[0021] In some embodiments, the method may further comprise pre-processing the at least one input image by adjusting at least one brightness characteristic and at least one darkness characteristic of the at least one input image.
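One plausible reading of this pre-processing step is a levels adjustment that sets a black point and a white point and rescales intensities between them. The sketch below assumes an 8-bit grayscale image stored as nested lists; the function name `adjust_levels` and the chosen points are hypothetical, and the disclosure does not specify the adjustment used.

```python
def adjust_levels(image, black_point, white_point):
    """Linearly rescale so black_point maps to 0 and white_point maps to 255."""
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    scale = 255.0 / (white_point - black_point)
    adjusted = []
    for row in image:
        # Clamp to the valid 8-bit range after rescaling.
        adjusted.append(
            [min(255, max(0, round((p - black_point) * scale))) for p in row]
        )
    return adjusted


img = [[30, 55, 105], [155, 225, 250]]
out = adjust_levels(img, black_point=55, white_point=225)
```

Pixels at or below the black point are pushed to 0 and pixels at or above the white point are pushed to 255, which stretches contrast in the range where the workpieces are expected to appear.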

[0022] In some embodiments, the semantic segmentation model may comprise a convolutional neural network.

[0023] In some embodiments, the instance segmentation model may comprise a vision transformer.

[0024] In another broad aspect, in accordance with some embodiments, there is generally provided a system for automated verification of a plurality of workpieces. The system comprises one or more imaging devices in communication with a processor, configured to acquire one or more input images of the plurality of workpieces at the predefined area and transmit the one or more input images to the processor, a user interface in communication with the processor, configured to receive a verification indication from the processor and output the verification indication on a display of the user interface, and the processor, configured to receive the one or more input images from the one or more imaging devices, segment at least one input image of the one or more input images based on a semantic segmentation model, generate a corresponding semantically segmented image based on the segmenting, the semantically segmented image comprising at least one visible segment and at least one hidden segment, identify one or more identified workpiece instances from the at least one input image based on an instance segmentation model, the segment image and one or more context inputs, generate at least one corresponding instance image based on the one or more identified workpiece instances, determine, based on an edge detection model, one or more edges and one or more corners for the one or more identified workpiece instances for the at least one corresponding instance image, map each of the one or more edges and the one or more corners to one or more sets of coordinates on a reference plane based on a reference plane calibration, determine one or more measured characteristics of each workpiece instance based on the one or more sets of coordinates, and verify, for each workpiece instance, the one or more measured characteristics of the workpiece instance with one or more expected characteristics associated with the predefined location, producing the verification indication. 
The plurality of workpieces may be located in a predefined area at a physical location, the plurality of workpieces may comprise workpieces of differing dimensions, and at least some of the plurality of workpieces may be in a vertical orientation.

[0025] In some embodiments, the plurality of workpieces may be arranged on a support structure.

[0026] In some embodiments, the support structure may comprise a plurality of referencing members for aligning the plurality of workpieces with respect to one another.

[0027] In some embodiments, a subset of referencing members from the plurality of referencing members may cooperate to define at least one channel for receiving at least one workpiece of the plurality of workpieces.

[0028] In some embodiments, the referencing members may comprise at least one of a pin protruding out of a surface of the support structure, and a plate arranged orthogonally to the surface of the support structure.

[0029] In some embodiments, the support structure may comprise a wall, a shelf, or a material cart.

[0030] In some embodiments, the workpiece plane calibration may comprise a calibration board placed on the workpiece plane, the calibration board comprising a pattern readable by the processor to determine at least one of a relative position and a relative orientation.

[0031] In some embodiments, the context input may comprise at least one of a positive prompt and a negative prompt, wherein the positive prompt corresponds to a region where workpieces are likely to be present and the negative prompt corresponds to a region where workpieces are likely to be absent.

[0032] In some embodiments, the context input may be predetermined for each predefined area based on characteristics of the imaging device.

[0033] In some embodiments, the one or more measured characteristics may comprise one or more of a length, a width, a height, a position, and a material type.

[0034] In some embodiments, the identified workpiece instances may comprise a lumber instance.

[0035] In some embodiments, the one or more imaging devices may comprise at least one monocular camera.

[0036] In some embodiments, the one or more imaging devices may comprise at least one stereoscopic camera.

[0037] In some embodiments, the one or more imaging devices may comprise at least one monocular camera and at least one stereoscopic camera.

[0038] In some embodiments, the instance segmentation model may be finetuned on one or more training materials, the one or more training materials associated with a material type of at least one workpiece of the plurality of workpieces.

[0039] In some embodiments, the support structure may be colored with one or more pre-selected colors, and the segmenting the at least one input image may be further based on the one or more pre-selected colors.

[0040] In some embodiments, the processor may be further configured to pre-process the at least one input image by adjusting at least one brightness characteristic and at least one darkness characteristic of the at least one input image.

[0041] In some embodiments, the semantic segmentation model may comprise a convolutional neural network.

[0042] In some embodiments, the instance segmentation model may comprise a vision transformer.

[0043] In another broad aspect, in accordance with some embodiments, there is generally provided a method for automated verification of a multi-ply sub-assembly, the multi-ply sub-assembly being located in a predefined area housing a plurality of channels at a physical location and comprising multiple workpieces, comprising: acquiring, at one or more imaging devices in communication with a processor, one or more input images of the multi-ply sub-assembly at the predefined area housing the plurality of channels; classifying, at the processor, for at least one input image of the one or more input images, each of the plurality of channels as occupied or unoccupied based on a classification model, to identify a plurality of occupied channels; generating, at the processor, for each occupied channel in the plurality of occupied channels, a multi-point prompt comprising at least one of a positive prompt and a negative prompt; segmenting, at the processor, the at least one input image of the one or more input images with a segmentation model based on the multi-point prompt to obtain one or more segmentation masks of the multi-ply sub-assembly within each occupied channel; detecting, at the processor, candidate corner points within each segmentation mask with a corner detection algorithm; selecting, at the processor, corners from among the candidate corner points to identify a plurality of selected corners based on a line detection algorithm; identifying, at the processor, a type of the multi-ply sub-assembly based on the selected corners; mapping, at the processor, the selected corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration to obtain mapped corner coordinates; computing, at the processor, one or more measured characteristics of the multi-ply sub-assembly based on the mapped corner coordinates; verifying, at the processor, the one or more measured characteristics of the multi-ply sub-assembly with one or more expected characteristics associated with the predefined location, producing a verification indication; and outputting, at a user interface in communication with the processor, the verification indication on a display of the user interface.

[0044] In some embodiments, the classification model is a support vector classification model.

[0045] In some embodiments, the corner detection algorithm is a Harris or Shi-Tomasi corner detection algorithm.
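For reference, the two named detectors differ mainly in how they score a pixel from its local structure tensor. The sketch below computes both scores from already-summed gradient products; the parameter names and the value `k=0.04` are conventional choices rather than values from the disclosure, and gradient computation and windowing are omitted.

```python
import math


def harris_response(sxx, syy, sxy, k=0.04):
    """Harris score: det(M) - k * trace(M)^2 for M = [[sxx, sxy], [sxy, syy]]."""
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


def shi_tomasi_response(sxx, syy, sxy):
    """Shi-Tomasi score: the smaller eigenvalue of the structure tensor M."""
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return half_trace - root


# Strong gradients in both directions (corner-like) versus one direction (edge-like).
corner_like = (10.0, 10.0, 0.0)
edge_like = (10.0, 0.0, 0.0)

corner_scores = (harris_response(*corner_like), shi_tomasi_response(*corner_like))
edge_scores = (harris_response(*edge_like), shi_tomasi_response(*edge_like))
```

Both scores are high for the corner-like tensor and low (negative or zero) for the edge-like tensor, which is why thresholding either score yields candidate corner points.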

[0046] In some embodiments, the line detection algorithm is a Hough transform algorithm.
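A Hough transform typically reports each line in normal form, x·cos(θ) + y·sin(θ) = ρ. One possible way to use detected lines when selecting corners is to keep only the candidate corner points that lie near an intersection of two detected lines. The sketch below illustrates that idea; the selection rule, the `max_dist` threshold, and the function names are assumptions, not the method described in the disclosure.

```python
import math
from itertools import combinations


def intersect(line_a, line_b):
    """Intersect two lines given as (rho, theta); return None if near-parallel."""
    (r1, t1), (r2, t2) = line_a, line_b
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < 1e-9:
        return None
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return (x, y)


def select_corners(candidates, lines, max_dist=3.0):
    """Keep candidate corners within max_dist pixels of any line intersection."""
    crossings = []
    for a, b in combinations(lines, 2):
        p = intersect(a, b)
        if p is not None:
            crossings.append(p)
    return [c for c in candidates
            if any(math.dist(c, p) <= max_dist for p in crossings)]


# Hypothetical detections: the vertical line x = 100 and the horizontal line y = 50.
lines = [(100.0, 0.0), (50.0, math.pi / 2)]
candidates = [(101, 49), (30, 30)]
selected = select_corners(candidates, lines)
```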

[0047] In some embodiments, the method further comprises identifying and verifying an orientation of the multi-ply sub-assembly based on the plurality of selected corners.

[0048] In some embodiments, the method further comprises determining a number of plies in the multi-ply sub-assembly based on a count of the plurality of occupied channels.
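The ply count follows directly from the per-channel classification results. A minimal sketch, assuming the classifier output is available as a mapping from a channel identifier to an occupied flag (the mapping format and the name `count_plies` are hypothetical):

```python
def count_plies(channel_states):
    """Return the number of occupied channels and their identifiers.

    channel_states maps a channel id to True when that channel was
    classified as occupied.
    """
    occupied = [cid for cid, is_occupied in channel_states.items() if is_occupied]
    return len(occupied), occupied


# Hypothetical classification output for a four-channel support structure.
ply_count, occupied_channels = count_plies({0: True, 1: True, 2: False, 3: True})
```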

[0049] In some embodiments, the type of the multi-ply sub-assembly is selected from a group comprising: header-sill, king-jack stud, posts, and beam-pocket.

[0050] In another broad aspect, in accordance with some embodiments, there is generally provided a system for automated verification of a multi-ply sub-assembly, comprising: one or more imaging devices in communication with a processor, configured to acquire one or more input images of the multi-ply sub-assembly at a predefined area housing a plurality of channels and transmit the one or more input images to the processor; a user interface in communication with the processor, configured to receive a verification indication from the processor and output the verification indication on a display of the user interface; and the processor, configured to: receive the one or more input images from the one or more imaging devices; classify, for at least one input image of the one or more input images, each of the plurality of channels as occupied or unoccupied based on a classification model, to identify a plurality of occupied channels; generate, for each occupied channel in the plurality of occupied channels, a multi-point prompt comprising at least one of a positive prompt and a negative prompt; segment the at least one input image of the one or more input images with a segmentation model based on the multi-point prompt to obtain one or more segmentation masks of the multi-ply sub-assembly within each occupied channel; detect candidate corner points within each segmentation mask with a corner detection algorithm; select corners from among the candidate corner points to identify a plurality of selected corners based on a line detection algorithm; identify a type of the multi-ply sub-assembly based on the selected corners; map the selected corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration to obtain mapped corner coordinates; compute one or more measured characteristics of the multi-ply sub-assembly based on the mapped corner coordinates; and verify the one or more measured characteristics of the multi-ply sub-assembly with one or more expected characteristics associated with the predefined location, producing a verification indication.

[0051] In some embodiments, the classification model is a support vector classification model.

[0052] In some embodiments, the corner detection algorithm is a Harris or Shi-Tomasi corner detection algorithm.

[0053] In some embodiments, the line detection algorithm is a Hough transform algorithm.

[0054] In some embodiments, the type of the multi-ply sub-assembly is selected from a group comprising: header-sill, king-jack stud, posts, and beam-pocket.

[0055] In some embodiments, the processor is further configured to identify and verify an orientation of the multi-ply sub-assembly based on the plurality of selected corners.

[0056] In some embodiments, the processor is further configured to determine a number of plies in the multi-ply sub-assembly based on a count of the plurality of occupied channels.

DRAWINGS

[0057] For a better understanding of the embodiments described herein and to show more clearly how they may be carried into effect, reference will now be made, by way of example only, to the accompanying drawings which show at least one exemplary embodiment, and in which:

[0058] FIG. 1 shows a schematic diagram of an example vision system in accordance with some embodiments.

[0059] FIG. 2 shows a schematic diagram of an example vision system in accordance with some embodiments.

[0060] FIG. 3 shows a block diagram of an example server in accordance with some embodiments.

[0061] FIG. 4 shows an example support structure in accordance with some embodiments.

[0062] FIG. 5 shows a flow diagram of an example method for automated verification of a plurality of workpieces in accordance with some embodiments.

[0063] FIG. 6 shows a number of example input images and example semantically segmented images.

[0064] FIG. 7 shows an example instance segmented image in accordance with some embodiments.

[0065] FIG. 8 shows an example input image and corresponding instance segmented image in accordance with some embodiments.

[0066] FIG. 9 shows an example input image with corresponding detected edges in accordance with some embodiments.

[0067] FIG. 10 shows an example calibration square in accordance with some embodiments.

[0068] FIG. 11 shows an example support structure in accordance with some embodiments.

[0069] FIG. 12 shows an example of a multi-ply sub-assembly in accordance with an embodiment.

[0070] FIG. 13 shows an example of a multi-ply sub-assembly in accordance with another embodiment.

[0071] FIG. 14 shows an example of a multi-ply sub-assembly in accordance with a further embodiment.

[0072] FIG. 15A shows an example of a multi-ply sub-assembly in accordance with another embodiment.

[0073] FIG. 15B shows an example of a multi-ply sub-assembly in accordance with a further embodiment.

[0074] FIG. 16 shows a schematic diagram of an example vision system in accordance with an embodiment.

[0075] FIG. 17 shows an example input image of a multi-ply sub-assembly on a support structure in accordance with an embodiment.

[0076] FIG. 18 shows a schematic diagram of an example vision system in accordance with another embodiment.

[0077] FIG. 19 shows a block diagram of an example server in accordance with an embodiment.

[0078] FIG. 20A shows an example image of a multi-ply sub-assembly with example candidate corners in accordance with an embodiment.

[0079] FIG. 20B shows an example image of a multi-ply sub-assembly with example candidate corners in accordance with another embodiment.

[0080] FIG. 21A shows an example image of a multi-ply sub-assembly with example of detected straight lines in accordance with an embodiment.

[0081] FIG. 21B shows an example image of a multi-ply sub-assembly with example of detected straight lines in accordance with another embodiment.

[0082] FIG. 22A shows an example image of a multi-ply sub-assembly with example of selected corners in accordance with an embodiment.

[0083] FIG. 22B shows an example image of a multi-ply sub-assembly with example of selected corners in accordance with another embodiment.

[0084] FIG. 23 shows an example image of a multi-ply sub-assembly in accordance with an embodiment.

[0085] FIG. 24 shows a flow diagram of an example method for automated verification of a plurality of multi-ply workpieces in accordance with an embodiment.

[0086] Further aspects and features of the example embodiments described herein will appear from the following description taken together with the accompanying drawings.

DESCRIPTION OF VARIOUS EMBODIMENTS

[0087] Various embodiments in accordance with the teachings herein will be described below to provide an example of at least one embodiment of the claimed subject matter. No embodiment described herein limits any claimed subject matter. The claimed subject matter is not limited to devices, systems or methods having all of the features of any one of the devices, systems or methods described below or to features common to multiple or all of the devices, systems or methods described herein. It is possible that there may be a device, system or method described herein that is not an embodiment of any claimed subject matter. Any subject matter that is described herein that is not claimed in this document may be the subject matter of another protective instrument, for example, a continuing patent application, and the applicants, inventors or owners do not intend to abandon, disclaim or dedicate to the public any such subject matter by its disclosure in this document.

[0088] For simplicity and clarity of illustration, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the subject matter described herein. However, it will be understood by those of ordinary skill in the art that the subject matter described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the subject matter described herein. The description is not to be considered as limiting the scope of the subject matter described herein.

[0089] It should also be noted that the terms “coupled” or “coupling” as used herein can have several different meanings depending on the context in which these terms are used. For example, the terms coupled or coupling can have a mechanical, fluidic or electrical connotation. For example, as used herein, the terms coupled or coupling can indicate that two elements or devices can be directly connected to one another or connected to one another through one or more intermediate elements or devices via an electrical or magnetic signal, electrical connection, an electrical element or a mechanical element depending on the particular context. Furthermore, coupled electrical elements may send and/or receive data.

[0090] Unless the context requires otherwise, throughout the specification and claims which follow, the word “comprise” and variations thereof, such as, “comprises” and “comprising” are to be construed in an open, inclusive sense, that is, as “including, but not limited to”.

[0091] It should also be noted that, as used herein, the wording “and/or” is intended to represent an inclusive-or. That is, “X and/or Y” is intended to mean X or Y or both, for example. As a further example, “X, Y, and/or Z” is intended to mean X or Y or Z or any combination thereof.

[0092] It should be noted that terms of degree such as "substantially", "about" and "approximately" as used herein mean a reasonable amount of deviation of the modified term such that the end result is not significantly changed. These terms of degree may also be construed as including a deviation of the modified term, such as by 1%, 2%, 5% or 10%, for example, if this deviation does not negate the meaning of the term it modifies.

[0093] Furthermore, the recitation of numerical ranges by endpoints herein includes all numbers and fractions subsumed within that range (e.g. 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.90, 4, and 5). It is also to be understood that all numbers and fractions thereof are presumed to be modified by the term "about" which means a variation of up to a certain amount of the number to which reference is being made if the end result is not significantly changed, such as 1%, 2%, 5%, or 10%, for example.

[0094] Reference throughout this specification to “one embodiment”, “an embodiment”, “at least one embodiment” or “some embodiments” means that one or more particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments, unless otherwise specified to be not combinable or to be alternative options.

[0095] As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. It should also be noted that the term “or” is generally employed in its broadest sense, that is, as meaning “and/or” unless the content clearly dictates otherwise.

[0096] Similarly, throughout this specification and the appended claims the term “communicative” as in “communicative pathway,” “communicative coupling,” and in variants such as “communicatively coupled,” is generally used to refer to any engineered arrangement for transferring and/or exchanging information. Exemplary communicative pathways include, but are not limited to, electrically conductive pathways (e.g., electrically conductive wires, electrically conductive traces), magnetic pathways (e.g., magnetic media), optical pathways (e.g., optical fiber), electromagnetically radiative pathways (e.g., radio waves), or any combination thereof. Exemplary communicative couplings include, but are not limited to, electrical couplings, magnetic couplings, optical couplings, radio couplings, or any combination thereof.

[0097] Throughout this specification and the appended claims, infinitive verb forms are often used. Examples include, without limitation: “to detect,” “to provide,” “to transmit,” “to communicate,” “to process,” “to route,” and the like. Unless the specific context requires otherwise, such infinitive verb forms are used in an open, inclusive sense, that is as “to, at least, detect,” “to, at least, provide,” “to, at least, transmit,” and so on.

[0098] The example systems and methods described herein may be implemented as a combination of hardware or software. In some cases, the examples described herein may be implemented, at least in part, by using one or more computer programs, executing on one or more programmable devices comprising at least one processing element, and a data storage element (including volatile memory, nonvolatile memory, storage elements, or any combination thereof). These devices may also have at least one input device (e.g. a keyboard, mouse, touchscreen, or the like), and at least one output device (e.g. a display screen, a printer, a wireless radio, or the like) depending on the nature of the device.

[0099] Some elements that are used to implement at least part of the systems, methods, and devices described herein may be implemented via software that is written in a high-level procedural language such as object-oriented programming. The program code may be written in C++, C#, JavaScript, Python, or any other suitable programming language and may comprise modules or classes, as is known to those skilled in object-oriented programming. Alternatively, or in addition thereto, some of these elements implemented via software may be written in assembly language, machine language, or firmware as needed. In either case, the language may be a compiled or interpreted language.

[0100] At least some of these software programs may be stored on a computer readable medium such as, but not limited to, a ROM, a magnetic disk, an optical disc, a USB key, and the like that is readable by a device having at least one processor, an operating system, and the associated hardware and software that is used to implement the functionality of at least one of the methods described herein. The software program code, when read by the device, configures the device to operate in a new, specific, and predefined manner (e.g., as a specific-purpose computer) in order to perform at least one of the methods described herein.

[0101] Furthermore, at least some of the programs associated with the systems and methods described herein may be capable of being distributed in a computer program product including a computer readable medium that bears computer usable instructions for one or more processors. The medium may be provided in various forms, including non-transitory forms such as, but not limited to, one or more diskettes, compact disks, tapes, chips, and magnetic and electronic storage. Alternatively, the medium may be transitory in nature such as, but not limited to, wire-line transmissions, satellite transmissions, internet transmissions (e.g. downloads), media, digital and analog signals, and the like. The computer useable instructions may also be in various formats, including compiled and non-compiled code.

[0102] As stated in the background, major urban centers increasingly suffer from a shortage of housing. This shortage is due, in part, to challenges in accessing a skilled labor workforce and, more generally, to the elongated timelines inherent in antiquated and manual construction processes. Similar challenges have also affected new housing supplies in remote, rural and urban areas, which also suffer from an acute lack of available labor to build new houses.

[0103] To this end, it has been appreciated that automated construction techniques may assist in mitigating the lagging supply of new housing infrastructure. For example, automated processes may decrease reliance on a skilled labor workforce and may also expedite construction timelines. Automated construction techniques may also have the benefit of reducing total construction costs.

[0104] The construction industry, including in the field of home panel fabrication, predominantly relies on manual labor. Currently, most home panels are built manually, which can be time consuming, labor intensive, produce inconsistencies in quality, lead to higher chances of human error, and result in potential safety risks. These manual processes can also be less efficient compared to other automated methods in climate-controlled environments, which may not be present in a construction context. Some unique challenges faced in adopting automation in the construction industry include the need for customization in building projects, the variety of materials used, and the often unstructured and changing work environments. Additionally, the frequency of defects in natural materials like wood presents a further challenge to construction tasks. Automation of repetitive and labor-intensive tasks like manual inspection may speed up the overall construction process, leading to increased overall efficiency and productivity.

[0105] Systems that integrate various sensors to perceive the surrounding environment can be used in systems for automating construction. Integrating advanced sensing capability may allow for more precise and consistent handling of construction materials and components, thereby reducing the variability of errors associated with manual labor. For example, robotic assembly cells can use vision sensors, laser scanners, and force sensors to enhance precision, efficiency, and safety in construction automation. Vision sensors, such as cameras, may be operable to facilitate real-time quality inspection, alignment verification, and collision avoidance by capturing and analyzing images of building components. Laser sensors, which may be located on a robot’s end-of-arm and in other key locations in a robot cell, may assist with detecting and positioning building materials like OSBs, studs, and plates, and may facilitate precise nailing and tool alignment. Force / torque sensors can add another dimension to the input received by robot controllers by monitoring physical interactions taking place during tasks like nailing and sheathing, detecting abnormalities (e.g., misfires, partial nail penetration), preventing collisions, and ensuring proper alignment of components like floor-sheet joints. These types of sensory inputs may be used in a perception system for a robotic assembly system that utilizes visual, spatial, and tactile data, enabling the robotic assembly to perform complex automated construction tasks with high accuracy and adaptability.

[0106] Vision systems, in particular, may be helpful in dealing with and addressing human errors before they propagate into larger issues. For instance, incorrect components may be loaded into a cart or staged in an assembly area, which can disrupt the assembly process. Additionally, quality inspections are frequently performed in assembly environments to determine whether correct and compliant building parts are used. However, such inspections are often performed manually, which can be subject to human error.

[0107] The present disclosures may describe vision systems capable of identifying incorrect component loading into various material loading areas, thereby reducing disruptions caused by human errors in the assembly process. The vision systems may, in some embodiments, identify such errors by comparing the loaded components against expected specifications (including the position and dimensions of the components), ensuring the correct parts are in place for assembly. The presently disclosed systems may automate parts of or all of the quality inspection process, helping to detect defects such as knots and cracks in lumber by, for instance, employing computer vision algorithms. The presently disclosed vision system may also ensure accurate placement and alignment of components by detecting edges and corners of parts that are observed by, for example, a camera. The estimates from the vision algorithm can be used to take corrective action and / or alert human operators with respect to various errors that can be visually detected, such as errors in the position or alignment of a part.
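
By way of non-limiting illustration only, the comparison of loaded components against expected specifications could be sketched in Python as follows. The class name `ComponentSpec`, the function `verify_component`, the millimetre units, and the 3 mm tolerance are all hypothetical choices made for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentSpec:
    """Expected or measured properties of one staged component (assumed units: mm)."""
    x: float       # position along the loading area
    width: float
    length: float


def verify_component(measured, expected, tol_mm=3.0):
    """Return a list of mismatch descriptions; an empty list means the part passes."""
    issues = []
    for name in ("x", "width", "length"):
        diff = abs(getattr(measured, name) - getattr(expected, name))
        if diff > tol_mm:
            issues.append(f"{name}: off by {diff:.1f} mm (tolerance {tol_mm} mm)")
    return issues


# Hypothetical values: one part within tolerance, one cut 100 mm too short.
expected = ComponentSpec(x=100.0, width=38.0, length=2438.0)
good = ComponentSpec(x=101.0, width=38.5, length=2437.0)
bad = ComponentSpec(x=100.0, width=38.0, length=2338.0)

good_issues = verify_component(good, expected)
bad_issues = verify_component(bad, expected)
```

In such a sketch, a non-empty list of issues could be used to trigger corrective action or an operator alert; how the measured values are obtained from images is outside the scope of the example.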

[0108] The present disclosures describe systems that are capable of handling a diversity of tasks necessitated by the varying customizations and design differences from one home to another. The disclosed systems are capable of adapting to changes, allowing the robot to efficiently perform a wide range of customized tasks. The presently disclosed systems may be capable of handling a variety of material types commonly used in construction, such as Laminated Strand Lumber (LSL), Spruce-Pine-Fir (SPF), Oriented Strand Board (OSB), and others. This versatility may be advantageous in a sector where material variability can significantly impact the assembly process. The presently disclosed systems may lead to safer and more efficient construction processes, which can reduce the likelihood of human error affecting operations and enhance the overall quality of the construction.

[0109] Beyond inspecting individual components, the present disclosures also describe an automated verification approach that leverages vision-based techniques to ensure multi-ply sub-assemblies are correctly configured before robotic handling. Modern construction workflows increasingly rely on pre-assembled multi-ply lumber components, such as king-jack studs, beam-pocket assemblies, posts, and header-sill sub-assemblies, to accelerate robotic assembly and reduce cycle times. While pre-assembly improves throughput, it introduces a critical dependency on accurate verification of these complex components prior to robot pickup. Unlike single-ply lumber, multi-ply assemblies exhibit variable aggregate thickness, irregular edge patterns, and elevation differences between plies, which complicate automated inspection. Traditional vision algorithms may fail to reliably identify internal ply boundaries, measure combined dimensions, or detect orientation-specific features such as sill protrusions. Human operators often load assemblies incorrectly, such as by reversing header-sill orientation, resulting in costly work stoppages, manual intervention, and rework when errors are detected downstream. These challenges underscore the need for a robust, automated verification system capable of handling multi-ply assemblies with diverse geometries and orientations under real-world conditions.

[0110] Reference is first made to FIG. 1, which shows a schematic diagram of an example vision system 100 for automated verification of a plurality of workpieces in accordance with some embodiments of the disclosed invention. System 100 may include a server 102, imaging devices 104, and a user interface device 106. Server 102, imaging devices 104, and user interface device 106 may be connected to each other through network 112. Imaging devices 104 are configured to acquire one or more images of a plurality of workpieces 108. Workpieces 108 can be workpieces for home construction, such as framing lumber, plywood sheets, or any other kinds of workpieces that can be used for construction or in the fabrication of homes.

[0111] The plurality of workpieces 108 may be located in a predefined area 110 at a physical location 114. The physical location 114 may be any location that may house a plurality of workpieces, such as a construction site, an assembly facility, or a warehouse. The predefined area 110 may be one or more designated areas for organized placement of workpieces in the physical location 114, such as a material staging area for arranging the materials for immediate use in an assembly process, or a storage area where the materials may be sorted and organized for future use. Each predefined area may have various characteristics associated therewith, such as information relating to what workpieces should be present and how they should be arranged. For example, a predefined area for assembling wall panels may be associated with a requirement for using small pieces of lumber, cut to pre-determined lengths. The requirements may be specific for each type of wall panel and thus require staging and verification prior to assembly.

[0112] The plurality of workpieces 108 may consist of one or more workpieces of a single type, or of a multitude of different types of workpieces. For example, the plurality of workpieces 108 can contain a mixture of 2x4 boards and oriented strand boards, all of which may be collectively imaged by imaging devices 104. The plurality of workpieces 108 may further contain members having differing dimensions from one another. For example, some workpieces 108 may be 2x4 boards and some may be 2x6 boards. Further, the lengths of the boards may vary. It will be understood that in many instances, workpieces of different types, sizes, and shapes may all need to be stored or staged together, for instance, when multiple different types of workpieces need to be assembled together into a single structure comprising different materials.

[0113] In some embodiments, the plurality of workpieces may be supported by a support structure at the predefined area. The support structure may be any structure on which or in which the workpieces may be placed or arranged in an organized fashion at the predefined area. For example, the support structure could be the shelf 400 shown in FIG. 4. The support structure 110 can contain a number of referencing members to facilitate the placement and alignment of the workpieces with respect to the support structure. The referencing members can be pins, dividers, plates, or any other feature of the support structure that helps organize the workpieces. The referencing members can be integral to the support structure or removable, and can be made of plastic, metal, wood, or any other suitable type of material. For example, shelf 400 contains groups of datum pins 402 and 410. The groups of datum pins 402 and 410 are arranged so as to define a plurality of channels for receiving workpieces 408 and 406, respectively. As shown, the datum pins are arranged to form channels of differing sizes depending on which workpiece is intended to be placed within the channel. For example, datum pins 410a, 410b, 410c, 410d, 410e are arranged to define a channel for a workpiece 406c. Datum pins 402a, 402b, 402c, 402d, 402e, and 402f are arranged to define a channel for workpiece 408j. As workpieces 406 are wider than workpieces 408, datum pins 410a-410e are spaced out further to form wider channels than the channels defined by datum pins 402a-402f. Additionally, a datum plate 404 is used to support workpieces 408 from the bottom, ensuring that the bottom of each workpiece 408 is positioned at an equal height.

[0114] In some embodiments, the support structure may be colored with one or more pre-selected colors. Each pre-selected color may be specially selected to facilitate improved computer vision processing, thereby aiding in improved material recognition and positioning. For example, the pre-selected color may improve differentiation between workpiece textures and the support structure by enhancing contrast therebetween and maximizing spectral difference between adjacent objects, thereby improving segmentation performance. As another example, the pre-selected color may improve imaging quality. For example, the shelf 1100 of FIG. 11 can be painted with an anti-reflective blue paint that enhances contrast between the workpiece and the shelf and reduces potential reflections into the imaging device that may degrade photo quality.

[0115] FIG. 11 shows another example support structure in the form of a shelf 1100. Shelf 1100 contains four rows of datum pins (1102, 1104, 1106, 1108) to support workpieces vertically. Rows 1102 and 1104 cooperate to form channels for slotting workpieces therebetween, as do rows 1106 and 1108. Two plates (1110, 1112) are provided to support workpieces from the bottom. It can be seen that the cooperation of the reference elements allows workpieces to be arranged along a number of channels, in two separate rows, on shelf 1100.

[0116] Referring back to FIG. 1, the network 112 may be any network capable of carrying data, which may be connected via infrastructure such as Ethernet, RS-485, PROFIBUS, CAN Bus, USB, plain old telephone service (POTS) line, public switched telephone network (PSTN), integrated services digital network (ISDN), digital subscriber line (DSL), coaxial cable, fiber optics, satellite, mobile, wireless (e.g. Wi-Fi, WiMAX), SS7 signaling network, fixed line, and others, including any combination of these, capable of interfacing with, and enabling communication between any or all of the server 102, the imaging devices 104, and / or the user interface device 106. Each of the server 102, imaging devices 104, and user interface device 106 may be equipped with suitable network communication hardware so as to enable communications with each other through protocols such as MODBUS, IEEE 802.3, IEEE 802.11, DeviceNet, or any other communications protocol suitable for communication between industrial devices.

[0117] The imaging devices 104 can consist of one or more devices capable of capturing an image, video, or a representation of a plurality of workpieces 108 at the predefined area. For example, the imaging devices 104 could include one or more of a stereoscopic camera (e.g. Stereolabs Zed-2 stereo camera), an industrial monocular RGB-image camera (e.g., Baumer VCGX.2-127C.I), an infrared imaging camera, or any other type of imaging device suitable for capturing images of the plurality of workpieces 108. The one or more imaging devices 104 may consist of a mixture of different types of imaging devices. For example, one or a combination of a monocular camera and a stereo camera can be used, which may enable the capture of more data points about the imaged workpieces, such as stereoscopic images containing depth information about the features captured in the images.

[0118] The imaging devices 104 may be positioned at a defined distance and angle relative to the predefined area. In some embodiments, the imaging device 104 may be repositionable. In some cases, if the imaging device 104 is repositioned, some recalibration may be required. In some embodiments, the imaging devices 104 can be mounted at the end of a robotic arm. In some embodiments, the imaging devices 104 can be mounted on a support structure, such as a ceiling, a rigid pole, or the structure of a robot cell.

[0119] The user interface device 106 can convey information to a user operating the vision system 100 and can accept input from the user for interacting with the system. The user interface device 106 may receive data from the server 102, such as annotated images and image processing results, and display it for the user on a display. The user interface device may additionally be capable of receiving inputs from a user operator. The inputs may, among other functions, be for modifying the settings of the system 100, for running machine vision tasks on the server 102, and for viewing results of previous machine vision tasks. For example, the user interface device 106 may display a user interface for an application module or a configuration settings page where settings for the system can be changed.

[0120] In some embodiments, the user interface device 106 may be a standalone unit capable of displaying information and accepting inputs, such as an industrial HMI (human-machine interface) or a mobile tablet. Such units may, in some cases, include integrated buttons and / or touchscreens to accept input. In some embodiments, the user interface device may consist of multiple devices for display and providing input. For example, the user interface device may comprise a computer monitor combined with input peripherals such as a keyboard and / or a computer mouse, directly connected to server 102.

[0121] In some embodiments, the user interface device 106 will accept data from the server 102 and render its own user interfaces to allow users to view data and interact with server 102. User interface device 106 may also process the inputs and send the processed inputs back to the server 102. For example, in the case where user interface device 106 is a PLC-HMI or a standalone tablet, the user interface device 106 will render its own graphics, and merely needs to receive data to be displayed on the rendered interfaces. Additionally, the user interface device 106 may be able to process inputs to the system, such as a series of touchscreen presses or button presses, and send back results of the input to the server 102. In some embodiments, the server 102 may be responsible for rendering user interfaces and graphics for the user interface device 106 as well as processing inputs from the user interface device 106, such as when the user interface device 106 simply consists of a computer monitor, computer mouse, and keyboard.

[0122] Server 102 can be a computing device that communicates with the imaging devices 104 and the user interface device 106 through network 112. Server 102 can be an edge computing device so as to reduce latencies associated with communicating with imaging devices 104. Server 102 may receive data from imaging devices 104, such as images of workpieces 108 arranged on support structure 110 at a predefined area. The server 102 may process the images using computer vision techniques, which include classical image processing techniques as well as machine learning techniques. Server 102 may be a computing device specially configured to run such computer vision techniques, including classical algorithms and machine learning models, to process images. For example, server 102 may be a Premio VCO-6000-CFL Machine Vision Computer, or any other model of computing device with suitable capabilities of performing machine vision tasks.

[0123] Reference is next made to FIG. 3, which shows a device diagram of an example server 102 of FIG. 1 in accordance with one or more embodiments. Server 102 can include a communication unit 304, an I / O unit 312, a power unit 302, a processor unit 308, and a memory unit 310.

[0124] The communication unit 304 can include any combination of hardware that enables wired or wireless connection capabilities. For example, the communication unit 304 can include a radio or network card that communicates using standards such as IEEE 802.3, IEEE 802.11, DeviceNet, MODBUS, and any other protocol suitable for communicating with industrial devices. The communication unit 304 can allow the server 102 to communicate with imaging devices 104, user interface device 106, other sensors (e.g., force sensors and torque sensors on a robotic arm), other devices, computers, local area networks, wide area networks, and external networks. In some embodiments, the server 102 can communicate with imaging devices 104 through an Ethernet interface, as enabled by the communication unit 304.

[0125] In the embodiment shown in FIG. 1, server 102 includes hardware enabling communication with the imaging devices 104 and the user interface device 106 through network 112. Server 102 can then communicate with imaging devices 104 and user interface device 106 to send and receive data, such as inputs to the system, image data, alerts, text data, and more. For instance, server 102 may receive one or more images from imaging devices 104 and relay the images to be displayed on user interface device 106. In some instances, the images may be displayed or streamed as a live video feed.

[0126] Server 102 may process a portion of the images from imaging device 104, for example, using machine learning models or classical image processing techniques. Based on the processed images, server 102 can perform additional actions, such as generating alerts and / or annotated images to be displayed on user interface device 106. In some embodiments, imaging devices 104 can directly communicate with the user interface device 106 to display images directly from the imaging device 104 onto user interface device 106 without being routed through server 102.

[0127] The processor unit 308 controls the operation of the server 102. The processor unit 308 can be any suitable processor, controller or digital signal processor that can provide sufficient processing power depending on the configuration, purposes and requirements of the server 102 as is known by those skilled in the art. For example, the processor unit 308 may be a high-performance general processor, such as an Intel® processor or an AMD® processor. The processor unit 308 may preferably contain a graphics card or a dedicated graphics processing unit (GPU), such as an Nvidia® RTX 5000 Ada, for accelerating locally run machine learning models. The provision of a GPU may provide speed advantages for running machine learning algorithms for tasks such as material verification, object recognition, defect detection, bit verification, and more. In some embodiments, the processor unit 308 can include more than one processor with each processor being configured to perform different dedicated tasks.

[0128] The I / O unit 312 can include hardware for interfacing with at least one of a mouse, a keyboard, a touch screen, a thumbwheel, a trackpad, a trackball, a card reader, an audio source, a microphone, voice recognition software, and the like, again depending on the particular implementation of the server 102. The I / O unit 312 may then facilitate taking user input to the server 102. In some cases, some of these components can be integrated with one another. In some embodiments, the I / O unit 312 may not be used as its functionalities may be combined with user interface device 106. For example, if the user interface device 106 is an industrial PLC-HMI that includes integrated input capabilities in the form of touchscreen controls or buttons built onto the device, no additional I / O peripherals may be needed for the server 102.

[0129] The power unit 302 can be any suitable power source that provides power to the server 102, such as a power adaptor or a rechargeable battery pack, depending on the implementation of the server 102 as is known by those skilled in the art.

[0130] The memory unit 310 comprises software code for implementing, among other programs, operating system 320, programs 322, semantic segmentation model 324, instance segmentation model 326, edge detection model 328, and mapping module 330. The memory unit 310 can include RAM, ROM, one or more hard drives, one or more flash drives or some other suitable data storage elements such as disk drives, etc. The memory unit 310 can be used to store an operating system 320 and programs 322 as is commonly known by those skilled in the art.

[0131] The operating system 320 may provide various basic operational processes for the server 102. For example, the operating system 320 may be a server operating system such as Ubuntu® Linux, Microsoft® Windows Server® operating system, or another operating system.

[0132] The programs 322 include various user programs and functional submodules of the vision system 100. They may include applications accessible locally or by users over the network. For example, programs 322 can include programs for configuring options of the system 100, the server 102, or the various programs, models, and modules stored within memory unit 310. Programs 322 can include application modules. Programs 322 can also include diagnostic tools for testing and troubleshooting connectivity and server health status. Programs 322 can include logging tools for logging activities and issues with the system and tools to facilitate access to the logs. Programs 322 can include maintenance tools allowing remote reboot functionalities, firmware update capabilities, and other maintenance abilities that allow for enhanced system longevity and security. Additionally, programs 322 can include programs for interacting with, controlling, and configuring imaging devices 104, such as programs for camera calibration, live feed viewing, and network settings. Additionally, programs 322 can include functional modules for running different machine vision tasks on system 100.

[0133] For instance, a functional module for bit verification may be provided, which can process image inputs to ensure tools are correct. As another example, a pre-cut inspection module can be provided which inspects input images of materials to perform quality assurance. A wall inspection module may be provided, which can analyze images to determine structural integrity of wall assemblies. An ID verification module can be provided, which can use image data to track components across different locations. A table clearance module can be provided, which can analyze image data of a working table to ensure that the work area is clean.

[0134] Programs 322 can include programs for providing user interfaces to interact with the above-described programs, modules, and models. Programs 322 can additionally include programs for displaying the results of executing the functional modules. For example, a user interface may be provided that provides annotated images generated by the functional module and information about the execution of the functional module in JSON format. The user interfaces may be accessible via any display means. For example, the user interfaces may be provided in a web browser format shown on a computer monitor or as an application running on a PLC-HMI. The user interfaces may be rendered and displayed, for example, on user interface device 106.
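
As a purely illustrative sketch, information about the execution of a functional module could be serialized to JSON in Python as shown below. The field names (`module`, `passed`, `instance_count`, `instances`, `width_mm`, `ok`) and the helper `build_result` are hypothetical and are not prescribed by the disclosure.

```python
import json


def build_result(module_name, passed, instances):
    """Assemble a result record for one run of a functional module (hypothetical schema)."""
    return {
        "module": module_name,
        "passed": passed,
        "instance_count": len(instances),
        "instances": instances,
    }


# Hypothetical per-workpiece results from a verification run.
result = build_result(
    "material_verification",
    False,
    [
        {"id": 1, "width_mm": 38.0, "ok": True},
        {"id": 2, "width_mm": 64.0, "ok": False},
    ],
)

# Text that a user interface could display or parse.
payload = json.dumps(result, indent=2)
```

A user interface could parse such a payload with `json.loads` and present it next to the annotated images.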

[0135] Semantic segmentation model 324 produces a plurality of semantic segments based on an input image. The input image may be partitioned into one or more semantic segments, which serve to facilitate the identification or labelling of different objects or regions in the input image. The input image could be segmented based on a number of different classifications. For example, the plurality of semantic segments could correspond to different types of objects present in the image. In addition, or in the alternative, the different regions could correspond to different material types of the same object (e.g., different types of wood for a wall stud). As another example, the semantic segments could be background objects vs foreground objects.

[0136] The semantic segmentation model 324 may produce the plurality of semantic segments in a number of different ways. For example, the plurality of semantic segments could be produced as a semantically segmented image. Different pixel values of the image may correspond to different segments. For example, where there are three different classes of objects visible in an image, a first color may be assigned to one class of object, a second color to a second class of object, and a third color to a third class of object. In the example of a simple binary classification such as foreground object vs background object, pixels corresponding to foreground objects can be set at one value while background objects can be set to another value. In some embodiments, a masked image may be produced wherein one or more non-relevant object classes (e.g. background objects) can be masked while one or more classes of interest (e.g. foreground objects or other objects of interest) remain visible. FIG. 6 depicts a series of example input images 602 to a semantic segmentation model and corresponding output images 604. Input images show different views of lumber workpieces 606 arranged in various positions against a shelf support structure 608. Output images 604a, 604b, 604c show background subtracted versions of input images 602a, 602b, 602c, respectively. The objects of interest (i.e., lumber workpieces) are shown in one color in the images while the background objects that are not of interest (i.e., shelf and floor) have been masked and are shown as black in the images.
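
The masking described above can be illustrated with a minimal Python sketch, assuming that the segmentation output is available as a per-pixel class map of the same size as the image. The function `mask_background`, the class identifiers, and the tiny grayscale example are hypothetical; a practical implementation would typically operate on array types rather than nested lists.

```python
def mask_background(image, class_map, keep_classes, fill=0):
    """Return a copy of `image` where pixels whose class is not in
    `keep_classes` are replaced by `fill` (e.g. black)."""
    return [
        [pix if cls in keep_classes else fill for pix, cls in zip(img_row, cls_row)]
        for img_row, cls_row in zip(image, class_map)
    ]


# Hypothetical class identifiers for a binary foreground/background split.
BACKGROUND, LUMBER = 0, 1

# Tiny 2x4 grayscale image and its per-pixel class map.
image = [
    [12, 200, 210, 15],
    [10, 205, 198, 11],
]
class_map = [
    [BACKGROUND, LUMBER, LUMBER, BACKGROUND],
    [BACKGROUND, LUMBER, LUMBER, BACKGROUND],
]

masked = mask_background(image, class_map, {LUMBER})
```

In this sketch the shelf and floor pixels are set to zero while lumber pixels keep their original values, analogous to the background-subtracted output images 604.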

[0137] In some embodiments, the plurality of semantic segments can be output as contours, which can be defined by vectors, separating different segments from each other. In some embodiments, the various semantic segments can be produced as separate images. It should be appreciated that the exact data structure used to represent the segments does not matter, so long as the segments can be created and read by the system 100 for computer vision processing tasks.

[0138] Semantic segmentation model 324 may contain a trained machine learning model configured to process images. The machine learning model can use deep learning architectures such as convolutional neural networks (CNN), recurrent neural networks, transformers, variational autoencoders, generative adversarial networks, and any other type of deep learning architecture as known to a person of skill in the art. For example, a CNN-based machine learning architecture such as ResNet can be used. However, any other suitable model architecture may be used, such as, for example, VGGNet, GoogLeNet, Long Short-Term Memory, GPT, or Vision Transformer (ViT). The machine learning model may take in an image input and produce, as an output, a semantically segmented image.

[0139] Instance segmentation model 326 may identify one or more individual workpiece instances from an input image containing a plurality of workpieces. The instance segmentation model 326 may take the input image as input to produce an identification of the individual instances. Optionally, the instance segmentation model 326 may additionally take, as inputs, a semantically segmented image and one or more context inputs to improve instance identification. The instance segmentation model may then segment the input image in accordance with the locations of individual instances of the workpieces. For example, FIG. 8a shows an example input image 800a to the instance segmentation model and FIG. 8b shows a corresponding instance segmented image 800b. As shown, workpiece instances 810 and 812 are correctly identified as individual workpieces and have been highlighted separately despite being positioned directly next to one another.

[0140] The instance segmentation model may generate a corresponding instance segmented image based on the identified individual workpiece instances, such as image 800b of FIG. 8b. The instance segmented image may encode information about the identified workpiece instances. For example, each pixel of the instance segmented image can contain a value indicating which instance it belongs to. Referring to FIG. 8b, all pixels associated with workpiece 810 may be set at a first pixel value and pixels associated with workpiece 812 may be set at a second pixel value.
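
As a non-limiting sketch of how such an encoding might be consumed downstream, the following Python example reads a label map in which each pixel holds the identifier of the instance it belongs to and derives a bounding box per instance. The function `instance_bounding_boxes`, the use of `0` for background, and the example label map are assumptions made for illustration.

```python
def instance_bounding_boxes(label_map, background=0):
    """Map each instance id to its (min_row, min_col, max_row, max_col) box."""
    boxes = {}
    for r, row in enumerate(label_map):
        for c, label in enumerate(row):
            if label == background:
                continue
            if label not in boxes:
                boxes[label] = [r, c, r, c]
            else:
                box = boxes[label]
                box[0] = min(box[0], r)
                box[1] = min(box[1], c)
                box[2] = max(box[2], r)
                box[3] = max(box[3], c)
    return {label: tuple(box) for label, box in boxes.items()}


# Two workpieces (ids 1 and 2) standing directly next to each other.
label_map = [
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 0],
]

boxes = instance_bounding_boxes(label_map)
```

Because the two instances carry different pixel values, they remain distinguishable even though they touch, which a simple foreground mask alone would not provide.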

[0141] The instance segmentation model may use a semantically segmented image to facilitate improved instance detection. For example, instance segmentation model 326 may use the semantically segmented images 604a-604c of FIG. 6, generated by the semantic segmentation model 324, as a mask input to determine the areas of the input image in which the workpieces are located. Then, the instance segmentation model can focus on performing instance identification only in relevant areas, and not in, for example, background areas that contain no workpiece instances. Images 604a-604c contain information about whether a pixel contains a workpiece, as background objects (non-workpiece images) have been masked out. Instance segmentation model 326 may then identify, within the visible areas, one or more individual workpiece instances. For example, FIG. 7 shows an example instance segmented image 700 containing five identified instances 710a-710e. FIG. 8b shows another example instance segmented image 800b with a number of identified workpiece instances visible. The pixels of each identified workpiece instance may be associated with a different value depending on which instance the pixel is identified to be associated with.

[0142] Instance segmentation model 326 may contain a trained machine learning model configured to process images. The machine learning model can use deep learning architectures such as convolutional neural networks (CNN), recurrent neural networks, transformers, variational autoencoders, generative adversarial networks, and any other type of deep learning architecture as known to a person of skill in the art. For example, a pre-trained vision transformer that is fine-tuned to detect workpiece instances may be used. However, any other suitable model architecture may be used, such as, for example, VGGNet, ResNet, GoogLeNet, Long Short-Term Memory, or GPT. The machine learning model may take in an image input and produce, as an output, an instance segmented image. The machine learning model may further be fine-tuned specifically on a training dataset containing labelled instances of relevant workpieces.

[0143] Instance segmentation model 326 may use one or more context inputs to facilitate improved instance detection. The context inputs can contain some prior knowledge of the predefined area. In particular, the context inputs may provide information as to which areas of the image are more likely to contain workpieces and which areas are less likely to contain workpieces. The context inputs can be in the form of point prompts, including positive prompts to indicate areas likely containing workpieces and negative prompts to indicate areas likely to not contain workpieces. For example, if the angle, position, and field of view of the imaging device capturing the image are fixed and known in advance, areas less likely to contain workpieces and areas more likely to contain workpieces can be identified ahead of time. For example, referring to the example input image 800a in FIG. 8a, it can be known ahead of time, based on knowing the layout of shelf 802, that there are unlikely to be any workpieces below datum plate 804, so the region below datum plate 804 can be negatively prompted to indicate the absence of workpieces. Additionally, a maximum height of the workpieces 806 positioned on top of the datum plate 804 may be known ahead of time. As such, a positive prompt can be generated encompassing the area in the image that the workpieces would likely occupy at most.
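
By way of non-limiting illustration, the following Python sketch shows one way a fixed, known camera layout could be turned into positive and negative point prompts. It assumes that image rows increase downward, that the datum plate appears at a known image row, and that the maximum workpiece height is known in pixels. The function name, the grid spacing, and all numeric values are hypothetical.

```python
def build_point_prompts(image_width, image_height, plate_row, max_height_px, step):
    """Return (positive, negative) lists of (x, y) point prompts.

    Assumes image rows grow downward, so pixels with y > plate_row lie
    below the datum plate, and pixels up to max_height_px above the plate
    may contain workpieces.
    """
    positive, negative = [], []
    top_row = max(0, plate_row - max_height_px)
    for y in range(0, image_height, step):
        for x in range(0, image_width, step):
            if y > plate_row:
                negative.append((x, y))  # below the plate: workpieces unlikely
            elif y >= top_row:
                positive.append((x, y))  # within the expected workpiece band
            # rows above top_row are left unprompted
    return positive, negative


positive, negative = build_point_prompts(
    image_width=40, image_height=60, plate_row=30, max_height_px=20, step=10
)
```

Points above the expected workpiece band are deliberately left unprompted in this sketch, so that the model is not biased either way in regions about which nothing is known.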

[0144] Edge detection model 328 may generate edges of the workpiece instances based on the identified workpiece instances. The edges may be x-y coordinates corresponding to pixels where an edge or a corner of an instance exists. The edge detection model 328 may use an instance segmented image produced by instance segmentation model 326 to produce the edges and corners. For example, FIG. 9 shows an example workpiece instance 904, with edges 902 surrounding the edges of the instance. Edges 902 represent pixels where an edge of instance 904 is determined to exist.

[0145] Edge detection model 328 may generate the edges by using any number of edge detection techniques. For instance, classical methods such as Sobel operators, Prewitt operators, Scharr operators, Difference-of-Gaussians, Canny Edge Detection, and any other technique as is known to a person of skill in the art may be used. Additionally, morphological dilation and erosion methods may be used. Frequency domain methods such as Fourier transforms and Wavelet transforms may be used. Machine learning-based techniques may also be used. For example, deep learning-based techniques using convolutional neural networks may be used. In some embodiments, a clustering method may be used to enhance straightness of the detected edges. Any clustering method known to a person skilled in the art may be used, including algorithms such as, but not limited to, Hough Transforms, K-Means Clustering, DBSCAN, and RANSAC.
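
By way of non-limiting illustration, one of the simpler options mentioned above, a morphological boundary, can be sketched in Python. The sketch keeps every pixel of an instance that has at least one 4-connected neighbour outside the instance, which is equivalent to subtracting from the instance mask its erosion by a cross-shaped structuring element. The function name, the label image, and the choice to treat out-of-image pixels as background are assumptions of this sketch.

```python
def boundary_pixels(label_image, instance_id):
    """Return sorted (x, y) pixels on the boundary of one instance.

    A pixel is kept if it belongs to the instance and at least one of its
    4-connected neighbours does not. Pixels outside the image are treated
    as background.
    """
    height, width = len(label_image), len(label_image[0])

    def inside(x, y):
        return (
            0 <= x < width
            and 0 <= y < height
            and label_image[y][x] == instance_id
        )

    edges = []
    for y in range(height):
        for x in range(width):
            if not inside(x, y):
                continue
            neighbours = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
            if not all(inside(nx, ny) for nx, ny in neighbours):
                edges.append((x, y))
    return sorted(edges)


# Hypothetical instance segmented image with a single 3x3 instance.
labels = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
edges = boundary_pixels(labels, 1)
```

Operating on the instance segmented image rather than on raw intensities means the result depends only on the instance labels, not on lighting or texture.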

[0146] Mapping module 330 may take edges generated by edge detection model 328 and mathematically transform them from a coordinate in the image pixel space to a coordinate on a workpiece plane space. For the purposes of the transformation, the imaged workpieces may be presumed to be resting against a common plane. The workpiece plane may then correspond to the common plane that the workpieces are resting against. In some embodiments, the position and orientation of the workpiece plane (e.g., the wall, floor, support structure, or other plane supporting the workpieces) relative to the imaging device may be previously known. In such cases, the system may be calibrated based on a known position, angle, and field of view of the imaging device used, and the known position of the workpiece plane, to project coordinates from image pixel space to the workpiece plane space. The projection may be faster and more robust in comparison to general 3D-space techniques, such as, but not limited to, depth-based projection using a camera model, which may be error-prone due to variable shapes and orientations typical in a construction environment.
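
By way of non-limiting illustration, a planar homography is one common way to realize a projection from image pixel space to a plane. The disclosure does not name a specific transform, so the use of a 3x3 homography here is an assumption. In the Python sketch below, the matrix values (2 mm per pixel, plane origin at pixel (100, 50)) and the function name are hypothetical.

```python
def pixel_to_plane(homography, pixel):
    """Map an (x, y) pixel to workpiece-plane coordinates with a 3x3 homography."""
    x, y = pixel
    h = homography
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    return (u / w, v / w)


# Hypothetical pre-computed calibration: 2 mm per pixel, with the plane
# origin located at pixel (100, 50).
H = [
    [2.0, 0.0, -200.0],
    [0.0, 2.0, -100.0],
    [0.0, 0.0, 1.0],
]

edge_pixels = [(100, 50), (350, 50), (350, 95)]
edge_plane = [pixel_to_plane(H, p) for p in edge_pixels]
```

Because all workpieces are presumed to rest against the same plane, a single matrix suffices for every edge coordinate, and no per-pixel depth estimate is needed.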

[0147] In some embodiments, the position and orientation of the workpiece plane relative to the imaging device may not previously be known. In such cases, the mapping module may be configured to perform a calibration at runtime based on known reference elements, such as landmarks, patterns, wireless signals, or any other means of detecting depth. For example, a pattern may be provided. Referring to FIG. 10, an example image 1000 of an example reference element is shown. A calibration square 1004 is provided on the surface of a rack 1002 to indicate the position and orientation of the workpiece plane relative to the imaging device. The calibration square 1004 may contain a pattern that is readable by mapping module 330 to determine the depth and orientation of the workpiece plane. For example, the mapping module may determine a skew, a scaling, a rotation, or any other transformation or distortion of the calibration square with respect to a known calibrated position of the calibration square. Based on these determinations, the relative orientation and position of the workpiece plane can be determined. In some embodiments, the imaging device 104 can contain stereo cameras capable of determining depth information about images that are captured therewith. This depth information may be used by the mapping module to calibrate the workpiece plane.

[0148] Image processing module 332 may perform simple adjustments and transformations to an image, including cropping, skewing, rotating, and adjusting brightness and contrast for the image. Image processing module 332 may pre-process input images received from imaging device 104 before processing using other described modules. For example, image processing module 332 may automatically adjust the brightness and contrast of received images prior to semantic segmentation to enhance the semantic segmentation model’s ability to discern between different areas of the image. Image processing module 332 can include one or more programs, scripts, or computer-implemented tools for image editing. For example, image processing libraries such as OpenCV, scikit-image, and NumPy could be used. The libraries may be integrated, for example, into a script written in Python. Alternatively, image processing module 332 may be implemented as modules in a machine learning framework with image adjustment capabilities, such as TensorFlow or PyTorch.

[0149] Reference is next made to FIG. 2, which shows the example vision system 100 of FIG. 1 in accordance with some alternative embodiments, operating in a physical location 114 that contains multiple predefined areas 110. While predefined areas 110a and 110b are explicitly shown, it will be appreciated that any number of predefined areas may exist in the physical location 114. System 100 may perform material verification and other computer vision processing tasks for some or all of the predefined areas, including areas 110a, 110b, and various other predefined areas that are not necessarily shown. In order to accomplish this, a plurality of imaging devices 104 may be used. For example, one or more imaging devices 104 may be provided for each predefined area. Alternatively, one or more imaging devices 104 may cover multiple predefined areas. For example, an imaging device 104 may be mounted on the end of a robot arm or a swiveling base. The imaging device 104 may then be repositionable so as to be able to capture multiple views, including views of two or more predefined areas.

[0150] System 100 may contain multiple user interface devices 106. User interface devices 106a and 106b are shown in FIG. 2, which may correspond to predefined areas 110a and 110b, respectively. A user interface device 106 may be provided for each predefined area 110. The user interface device 106 may display data specific to the predefined area. For example, a user interface device may display annotated images or alerts relating to the predefined area it is associated with, as produced by the vision system. In some embodiments, one user interface device may be provided for multiple predefined areas. In such cases, the user interface device 106 may provide capability to switch between the different predefined areas.

[0151] System 100 may additionally contain one or more servers 102. Even though only one server 102 is shown in FIG. 2, it will be appreciated that any number of servers 102 can be used and connected to the network 112.

[0152] Reference is next made to FIG. 5, which shows a method 500 for automated verification of a plurality of workpieces located in a predefined area at a physical location. Method 500 may help ensure that certain characteristics, such as type and dimension, of the workpieces used in the robotic assembly process at the predefined location are verified against information associated with a specific task. Additionally, method 500 may facilitate estimating an accurate position of a component for a pick-place task.

[0153] Workpieces 108 may contain members having varying dimensions from one another, such as varying lengths, thicknesses, or shapes. Method 500 may be implemented, for example, by system 100 of FIG. 1 and FIG. 2, to which references will be made in conjunction with FIG. 5, on workpieces 108 located in a predefined location 110 at a physical location 114. Specifically, imaging devices 104, user interfaces 106, and server 102 may perform method 500 in such instances. It will be appreciated that while method 500 may describe performing automated material verification at one predefined area, method 500 may be run in parallel for many different predefined areas at a physical location, such as, for example, in the example shown in FIG. 2, where method 500 can be performed by system 100 on predefined areas 110a and 110b simultaneously. In such instances, imaging devices 104a, user interface device 106a, and server 102 may be configured to perform method 500 with respect to workpieces 108a-108c at predefined area 110a while imaging devices 104b, user interface device 106b, and server 102 may be configured to perform method 500 with respect to workpieces 108d-108f at predefined area 110b.

[0154] In some embodiments, the plurality of workpieces may be arranged on a support structure. As described above, the support structure may be any structure on which or in which the workpieces may be placed or arranged in an organized fashion at the predefined area. The support structure could, for example, be a designated wall, a shelf, or a material cart. The support structure could, for example, be shelf 400 shown in FIG. 4, or shelf 1100 of FIG. 11.

[0155] In some embodiments, the support structure may comprise a plurality of referencing members for aligning the plurality of workpieces with respect to one another. The referencing members can facilitate the placement and alignment of the workpieces with respect to the support structure. In some instances, the referencing members may be carefully aligned to maintain a consistent datum or zero-reference point, for precise positioning and alignment. The referencing members can be pins, dividers, plates, or any other feature of the support structure that helps organize the workpieces. The referencing members can be integral to the support structure or removable, and can be made of plastic, metal, wood, or any other suitable type of material. In some embodiments, the referencing members may include pins protruding out of a surface of the support structure and/or plates arranged orthogonally to the surface of the support structure. For example, the referencing members can be pins, such as datum pins 1102, 1104, 1106, 1108 of FIG. 11 or pins 402, 410 of FIG. 4. The referencing members can also be plates, such as datum plate 404 of FIG. 4 and datum plates 1110, 1112 of FIG. 11.

[0156] In some embodiments, a subset of referencing members from the plurality of referencing members may cooperate to define at least one channel for receiving at least one workpiece of the plurality of workpieces. For example, with reference to FIG. 4, shelf 400 contains groups of datum pins 402 and 410. The groups of datum pins 402 and 410 are arranged so as to define a plurality of channels for receiving workpieces 408 and 406, respectively. As shown, the datum pins are arranged to form channels of differing sizes depending on which workpiece is intended to be placed within the channel. For example, datum pins 410a, 410b, 410c, 410d, and 410e are arranged to define a channel for a workpiece 406c. Datum pins 402a, 402b, 402c, 402d, 402e, and 402f are arranged to define a channel for workpiece 408j. As workpieces 406 are wider than workpieces 408, datum pins 410a-410e are spaced out further to form wider channels than the channels defined by datum pins 402a-402f. Additionally, a datum plate 404 is used to support workpieces 408 from the bottom, ensuring that the bottom of each workpiece 408 is positioned at an equal height.

[0157] In some embodiments, the support structure may be colored with one or more pre-selected colors. Further, the segmenting the at least one input image may further be based on the one or more pre-selected colors. The pre-selected colors may be a specially selected color that facilitates improved computer vision processing. For example, the pre-selected color may improve differentiation between workpiece textures and the support structure. As another example, the pre-selected color may improve imaging quality. For example, the shelf 1100 of FIG. 11 can be painted with an anti-reflective blue paint that enhances contrast between the workpiece and the shelf and reduces reflections into the imaging device that may degrade photo quality. The pre-selected color may help the semantic segmentation model in distinguishing workpieces from other background objects. In some embodiments, the identifying the one or more identified workpiece instances may also be further based on the one or more pre-selected colors. For example, the pre-selected color may facilitate improved performance in the instance segmentation model in identifying individual workpieces.

[0158] The method begins at 502 with acquiring, at one or more imaging devices in communication with a processor, one or more input images of the plurality of workpieces at the predefined area. For example, with reference to FIG. 1, the imaging devices 104 may image a plurality of workpieces 108a-c in the predefined area 110. As another example, referring to FIG. 2, one of imaging devices 104a or 104b may be imaging the workpieces located at predefined areas 110a or 110b, respectively. The imaging device 104 may be in communication with processor unit 308 of server 102, as shown in FIG. 3. The input images may then be transmitted to the server 102 for further processing at processor unit 308.

[0159] In some embodiments, the one or more imaging devices may include at least one monocular camera. For example, imaging device 104 of FIGS. 1 and 2 may include an industrial camera such as the Baumer VCGX.2-127C.1. In some embodiments, the one or more imaging devices may include at least one stereoscopic camera, and the one or more input images comprise at least one depth-containing image. For example, imaging device 104 of FIGS. 1 and 2 may include a stereoscopic camera such as the Stereolabs Zed-2 stereo camera. The stereoscopic camera may be configured to capture stereoscopic images, which may contain depth information about features captured in the images. In some embodiments, the one or more imaging devices can include at least one monocular camera and at least one stereoscopic camera. For example, the imaging devices 104 can be a combination apparatus that includes a combination of a monocular camera and a stereo camera. The cameras may be disposed directly adjacent to one another and be configured to take simultaneous or successive images.

[0160] In some embodiments, the at least one input image may be pre-processed by adjusting at least one brightness characteristic and at least one darkness characteristic of the at least one input image. For example, image processing module 332 can be used to pre-process the input images captured by imaging device 104 before the input images are processed using semantic segmentation model 324 and instance segmentation model 326. The pre-processing may entail adjusting the brightness and contrast of the image so as to facilitate improved segmentation of the image in subsequent processing stages. For example, dark or bright spots in the image can be removed.
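
By way of non-limiting illustration, a linear contrast stretch is one simple form of brightness and contrast adjustment. The disclosure does not specify a particular adjustment, so this choice is an assumption, and in practice a library such as OpenCV or scikit-image would typically be used instead of plain Python lists. The function name and pixel values below are hypothetical.

```python
def stretch_contrast(gray, low=0, high=255):
    """Linearly rescale grayscale values so they span [low, high]."""
    flat = [p for row in gray for p in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:
        return [row[:] for row in gray]  # flat image: nothing to stretch
    scale = (high - low) / (brightest - darkest)
    return [[round(low + (p - darkest) * scale) for p in row] for row in gray]


# Hypothetical dim 2x2 grayscale image occupying only the range 50..100.
dim_image = [[50, 60], [70, 100]]
enhanced = stretch_contrast(dim_image)
```

After the stretch, the darkest pixel maps to 0 and the brightest to 255, which widens the separation between workpiece and background intensities ahead of segmentation.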

[0161] The method proceeds to 504 with segmenting, at the processor, at least one input image of the one or more input images based on a semantic segmentation model to generate at least one corresponding semantically segmented image, the at least one semantically segmented image comprising at least one visible segment and at least one masked segment. For example, server 102 may operate a semantic segmentation model 324, as shown in FIG. 3, to segment the input images. As described above, the semantic segmentation model 324 can be a trained machine learning model configured to segment the input images into one or more segments, with each segment containing a different class of content. For example, the input images may be segmented into foreground objects (e.g., lumber workpieces) and background objects (e.g., shelf, wall, and floor). For example, referring to FIG. 6, semantically segmented images 604a-604c may be generated by semantic segmentation model 324 based on input images 602a-602c. As shown, workpieces visible in the input images 602, such as workpiece 606a of image 602a, correspond to visible segments in the resulting semantically segmented images, such as segment 612a in image 604a.

[0162] The method proceeds to 506 with identifying, at the processor, one or more identified workpiece instances from the at least one input image based on an instance segmentation model, the at least one semantically segmented image and one or more context inputs. The instance segmentation model may be the instance segmentation model 326 of FIG. 3. Individual workpiece instances contained in the input image from imaging device 104 may be identified. An identified workpiece instance may be, for example, a lumber joist. The identification of the individual workpiece instance from the input image may be assisted by the semantically segmented image. For example, the semantically segmented image may identify locations of workpieces. The instance segmentation model may focus instance identification in the areas where workpieces are identified, and not, for example, on areas containing background objects.

[0163] Additional context inputs may be provided to the instance segmentation model. The context inputs may contain some prior knowledge of the predefined area. In particular, the context inputs may provide information on which areas of the image are more likely to contain workpieces and which areas are less likely to contain workpieces. In some embodiments, the context input may be predetermined for each predefined area based on characteristics of the imaging device. For example, at certain predefined areas, the position, orientation, and field of view of the imaging device capturing the input images may be known. Additionally, the layout of the support structure for the devices may be known ahead of time. Thus, it may be possible to predetermine which areas in the input images are likely to contain workpieces and which areas are unlikely to contain workpieces. This information can, in turn, be provided to the instance segmentation model to facilitate improved instance detection. This may improve the robustness of the instance segmentation by reducing false detections and missed instances. In some embodiments, the context input may be in the form of at least one positive prompt or negative prompt, wherein the positive prompt corresponds to a region where workpieces are likely to be present and the negative prompt corresponds to a region where workpieces are likely to be absent. For example, the prompts may be in the form of point prompts, where each pixel is associated with a likelihood of workpiece presence.

[0164] In some embodiments, the instance segmentation model may be fine-tuned on one or more training materials, the one or more training materials associated with a material type of at least one workpiece of the plurality of workpieces. For example, the instance segmentation model may be fine-tuned on a curated and annotated dataset consisting of labelled images of different types of workpieces arranged at various predefined areas, annotated with labels distinguishing the different instances from one another. This may serve to improve the performance of the model in distinguishing and segmenting each individual workpiece from one another.

[0165] The method proceeds to 508 with generating, at the processor, at least one corresponding instance segmented image based on the one or more identified workpiece instances. For example, image 700 of FIG. 7 shows five different identified workpiece instances 710a-710e, generated based on input image 602b. As another example, image 800b of FIG. 8b shows various workpiece instances identified, generated from input image 800a of FIG. 8a.

[0166] The method proceeds to 510 with determining, at the processor, based on an edge detection model, one or more edges for the one or more identified workpiece instances for the at least one corresponding instance segmented image. The edges may be x-y coordinates corresponding to pixels where an edge or a corner of an instance exists. The edge detection model may be edge detection model 328 of FIG. 3, which may be configured to generate edges corresponding to the identified workpiece instances in the input image. The edge detection model 328 may use an instance segmented image produced by instance segmentation model 326 to produce the edges and corners. For example, FIG. 9 shows an example workpiece instance 904, with edges 902 surrounding the edges of the instance. Edges 902 represent pixels where an edge of instance 904 is determined to exist.

[0167] The method proceeds to 512 with mapping, at the processor, each of the one or more edges and the one or more corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration. The mapping may be performed by the mapping module 330 of FIG. 3. The edges, which may consist of coordinates in the image’s pixel space, may be mapped to a workpiece plane, which may be a mathematical representation of the plane that the workpieces are resting on (e.g., the surface of a shelf, cart, or wall). A workpiece plane calibration may be performed by the server 102 in order to determine the transformation required to map the coordinates from pixel space to the workpiece plane.

[0168] In some embodiments, the position and orientation of the workpiece plane (e.g., the wall, floor, support structure, or other plane supporting the workpieces) relative to the imaging device may be previously known. In such cases, the system may be calibrated based on a known position, angle, and field of view of the imaging device used, and the known position of the workpiece plane, to project coordinates from image pixel space to the workpiece plane space.

[0169] In some embodiments, the workpiece plane calibration may involve the use of a calibration board placed on the workpiece plane. The calibration board may contain a pattern readable by the processor to determine at least one of a relative position and a relative orientation. For example, FIG. 10 shows an example calibration board 1004, which contains a pattern that can be imaged by imaging devices 104 and analyzed by server 102, to determine the relative position and orientation of the camera to the workpiece plane. By using this calibration method, the locations of the cameras and workpieces need not be known in advance, as the workpiece plane can be determined at runtime by analyzing the calibration square. This may also allow a support structure supporting the workpieces, such as a cart, to be movable, as the workpiece plane can be recalibrated at each new location of the support structure.
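
By way of non-limiting illustration, the runtime calibration described above can be sketched in Python as estimating a planar homography from the four detected corners of the calibration pattern. The disclosure does not specify the estimation technique, so the four-point direct linear solve used here is an assumption, as are the function names, the detected pixel corners, and the 100 mm side length of the pattern.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(pixel_corners, plane_corners):
    """Estimate the 3x3 pixel-to-plane homography from four correspondences."""
    a, b = [], []
    for (x, y), (u, v) in zip(pixel_corners, plane_corners):
        # Two linear equations per correspondence, with h33 fixed to 1.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map an (x, y) pixel through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical detected corners of the calibration pattern (pixels) and
# their known positions on the workpiece plane (millimetres).
pixel_corners = [(100, 100), (200, 110), (190, 210), (90, 200)]
plane_corners = [(0, 0), (100, 0), (100, 100), (0, 100)]
H = homography_from_corners(pixel_corners, plane_corners)
```

Re-running this estimate whenever the support structure is moved yields a fresh mapping for the new location, which is what allows a movable cart to be used without a fixed camera-to-plane relationship.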

[0170] The method proceeds to 514 with determining, at the processor, one or more measured characteristics of each workpiece instance based at least on the one or more sets of coordinates. The one or more measured characteristics could include a position, orientation, length, width, height, shape, material type, or any other characteristic associated with the imaged workpieces. As the edges represent an estimation of a workpiece instance’s position on the surface or support structure the workpiece is resting on, physical characteristics such as dimensions, shapes, material type, position, and orientation of the workpiece may be determined using the edges. In some embodiments, other information can be used to further facilitate determining the measured characteristics. For example, pixel color data from the input images can be used to help determine the material type of a workpiece.
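
By way of non-limiting illustration, the following Python sketch derives a length, a width, and an orientation from four corner coordinates on the workpiece plane. It assumes the corners are listed in order around a roughly rectangular outline and are expressed in millimetres. The function name and the 38 mm by 2440 mm example are hypothetical.

```python
import math


def measure_workpiece(corners):
    """Estimate length, width, and orientation from four ordered corners.

    `corners` are (x, y) workpiece-plane coordinates listed around the
    outline, so consecutive corners share a side.
    """
    side_a = math.dist(corners[0], corners[1])
    side_b = math.dist(corners[1], corners[2])
    if side_a >= side_b:
        long_start, long_end = corners[0], corners[1]
    else:
        long_start, long_end = corners[1], corners[2]
    angle = math.degrees(
        math.atan2(long_end[1] - long_start[1], long_end[0] - long_start[0])
    )
    return {
        "length": max(side_a, side_b),
        "width": min(side_a, side_b),
        # Fold into [0, 180) because a workpiece has no preferred direction.
        "orientation_deg": angle % 180.0,
    }


measured = measure_workpiece([(0, 0), (38, 0), (38, 2440), (0, 2440)])
```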

[0171] The method proceeds to 516 with verifying, at the processor, for each workpiece instance, the one or more measured characteristics of the workpiece instance with one or more expected characteristics associated with the predefined location, producing a verification indication. The expected characteristics for a predefined location may set out limitations for acceptable characteristics of workpieces associated with the respective predefined location, such as limitations on dimensions, acceptable and not acceptable types of materials, and rules on how materials should be placed. For example, an assembly station may have a maximum length requirement for the workpieces that can be used at that station. Thus, an expected characteristic for the material staging area may include a maximum length requirement for workpieces. If a workpiece is determined to have a measured characteristic (i.e., length) exceeding that expected characteristic (i.e., maximum length), then the workpiece may be deemed non-compliant. The expected characteristics may be locally stored in memory on server 102 or may be stored in a database or other data storage structure remotely. In some embodiments, the server 102 can update the expected characteristics for the predefined location from a remote source. For example, at runtime, the server 102 can query a remote location to determine what the expected characteristics associated with the predefined location are, or to download an update to a stored version of the expected characteristics.
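
By way of non-limiting illustration, the comparison of measured characteristics against expected characteristics can be sketched in Python as follows. The dictionary keys (`max_length`, `allowed_materials`), the function name, and the example values are hypothetical and are not a format defined by the disclosure.

```python
def verify_workpiece(measured, expected):
    """Return (compliant, reasons) for one workpiece instance."""
    reasons = []
    max_length = expected.get("max_length")
    if max_length is not None and measured["length"] > max_length:
        reasons.append("length exceeds maximum")
    allowed = expected.get("allowed_materials")
    if allowed is not None and measured.get("material") not in allowed:
        reasons.append("material not allowed")
    return (not reasons, reasons)


# Hypothetical expected characteristics for one predefined location.
expected = {"max_length": 2500.0, "allowed_materials": {"spruce", "pine"}}

ok_result = verify_workpiece({"length": 2440.0, "material": "spruce"}, expected)
bad_result = verify_workpiece({"length": 3050.0, "material": "spruce"}, expected)
```

Returning the reasons alongside the pass or fail flag is a design choice of this sketch; it gives a downstream display enough detail to tell an operator what to correct.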

[0172] The verification indication may be an indication of the result of the verification. The verification indication could be a binary condition such as pass or fail. The verification indication could also be a percentage, such as an indicator of what percentage of workpieces were deemed compliant or non-compliant. In some embodiments, the verification indication could contain more complex information. For example, an annotated image may be produced that shows the result of the verification for each workpiece, wherein different colors, such as green or red, could be used to indicate compliance and non-compliance.
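
By way of non-limiting illustration, the three forms of verification indication described above (a binary result, a percentage, and a per-workpiece color annotation) can be derived from per-workpiece results as in the following Python sketch. The function name, the result keys, and the workpiece identifiers used as dictionary keys are hypothetical.

```python
def summarize(results):
    """Build a verification indication from per-workpiece pass/fail flags."""
    total = len(results)
    passed = sum(1 for ok in results.values() if ok)
    return {
        "status": "pass" if passed == total else "fail",
        "percent_compliant": 100.0 * passed / total if total else 100.0,
        "colors": {
            name: "green" if ok else "red" for name, ok in results.items()
        },
    }


indication = summarize({"108a": True, "108b": True, "108c": False, "108d": True})
```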

[0173] The method proceeds to 518 with outputting, at a user interface in communication with the processor, the verification indication on a display of the user interface. The user interface may be the user interface device 106, which contains a display capable of displaying the verification indication to a user. For example, if a workpiece is determined to be non-compliant, the verification indication generated at step 516 could be a fail, and the user interface device 106 can generate an alert on its display indicating the failure. Thus, a human operator, for example, an operator of the assembly station, may be able to address the issue by manually correcting the non-compliant workpiece. As another example, the verification indication may be an annotated image that shows the result of a verification for each workpiece, and the user interface device 106 may display the annotated image so that an operator can determine precisely where the point of issue lies.

[0174] In some embodiments, the vision systems may employ vision algorithms for the automated verification of multi-ply lumber sub-assemblies in industrial construction environments. Multi-ply sub-assemblies, such as king-jack studs or header-sill sub-assemblies, may be pre-assembled by fastening multiple pieces (plies) before robotic assembly to facilitate faster throughput on the assembly line.

[0175] Reference is made to FIGS. 12-16, which show examples of multi-ply sub-assemblies that have been fabricated in advance to save robotic assembly time. FIG. 12 shows king-jack sub-assembly 1202 comprising a king stud 1204 and a jack stud 1206. FIG. 13 shows a beam-pocket assembly 1302 comprising a first beam 1304, a second beam 1306, and a third beam 1308. FIG. 14 shows a posts subassembly 1402 comprising a first post 1404, a second post 1406, a third post 1408, a fourth post 1410, and a fifth post 1412. FIG. 15A shows a header-sill sub-assembly 1502 comprising a horizontal header 1504 and an attached sill 1506, which is in the right-sill orientation. FIG. 15B shows a header-sill sub-assembly 1503 in the left-sill orientation.

[0176] For a multi-ply sub-assembly, the automated verification process verifies that one or more multi-ply sub-assemblies loaded onto the cart contain the correct combination of lumber pieces in both dimensions and order. For example, a correctly assembled and oriented king-jack sub-assembly 1202 may require king stud 1204 to be on the left side of jack stud 1206 for a particular step in the overall assembly process. In another example, header-sill sub-assembly 1502 or 1503 must be correctly oriented to satisfy framing requirements. As part of automated verification, it is necessary to determine whether the sill is located on the right or left side of the header. An incorrect orientation prevents proper interfacing with the corresponding king-jack stud assembly. Human operators may inadvertently load header-sill assemblies into precut carts in the wrong orientation. Therefore, automated verification before robot pickup is essential to prevent downstream assembly failures. Detecting an incorrect orientation only after the robot has begun assembly results in costly work stoppages, as it requires manual intervention and restarting the sequence.
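
By way of non-limiting illustration, one simple way to decide whether the sill lies on the left or the right side of the header is to compare the horizontal centroids of the two plies on the workpiece plane. The disclosure does not specify how this determination is made, so the centroid comparison is an assumption, as are the function names, the convention that the plane x-axis increases to the right as seen by the imaging device, and the example coordinates.

```python
def centroid_x(corners):
    """Mean x-coordinate of a ply's corner points on the workpiece plane."""
    return sum(x for x, _ in corners) / len(corners)


def sill_side(header_corners, sill_corners):
    """Return 'right' or 'left' for the sill's side relative to the header.

    Assumes the plane x-axis increases to the right in the camera view.
    """
    if centroid_x(sill_corners) > centroid_x(header_corners):
        return "right"
    return "left"


def orientation_matches(header_corners, sill_corners, expected_side):
    """Check the detected sill side against the side the task expects."""
    return sill_side(header_corners, sill_corners) == expected_side


# Hypothetical plane coordinates (millimetres) of the two plies.
header = [(0, 0), (1200, 0), (1200, 140), (0, 140)]
sill = [(900, 140), (1200, 140), (1200, 178), (900, 178)]
detected = sill_side(header, sill)
```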

[0177] Reference is made to FIG. 16, which shows a schematic diagram of an example vision system 1600 for automated verification of a multi-ply sub-assembly in accordance with some embodiments of the disclosed invention. System 1600 is analogous to system 100 illustrated in FIG. 1. System 1600 may include a server 1602, imaging devices 1604, and a user interface device 1606. Server 1602, imaging devices 1604, and user interface device 1606 may be connected to each other through network 1612. Imaging devices 1604 are configured to acquire one or more images of multi-ply sub-assembly 1608, which may comprise workpieces 1608a and 1608b. Workpieces 1608a and 1608b can be workpieces for home construction, such as framing lumber, plywood sheets, or any other kinds of workpieces that can be used for construction or in the fabrication of homes.

[0178] The multi-ply sub-assembly 1608 may be located in a predefined area 1610 at a physical location 1614. The physical location 1614 may be any location that may house a plurality of workpieces, such as a construction site, an assembly facility, or a warehouse. The predefined area 1610 may be one or more designated areas for organized placement of workpieces in the physical location 1614, such as a material staging area for arranging the materials for immediate use in an assembly process, or a storage area where the materials may be sorted and organized for future use. Each predefined area may have various characteristics associated therewith, such as information relating to what workpieces should be present and how they should be arranged. For example, a predefined area for assembling wall panels may be associated with a requirement for using small pieces of lumber, cut to pre-determined lengths.

[0179] In some embodiments, the multi-ply sub-assembly may be supported by a support structure at the predefined area. The support structure may be any structure on which or in which the workpieces may be placed or arranged in an organized fashion at the predefined area. For example, the support structure could be the shelf or a rack 1500 shown in FIG. 17. The support structure 1610 can contain a number of referencing members to facilitate the placement and alignment of the multi-ply assembly with respect to the support structure. The referencing members can be pins, dividers, plates, or any other feature of the support structure that helps organize the workpieces. The referencing members can be integral to the support structure or removable, and can be made of plastic, metal, wood, or any other suitable type of material. For example, shelf 1500 contains referencing members 1508 and a group of datum pins 1510. The group of datum pins 1510a, 1510b, 1510c, and 1510d are arranged so as to define a channel for receiving header-sill sub-assembly 1502. Referencing members 1508a, 1508b, and 1508c may be arranged so as to define channels for receiving other multi-ply assemblies, such as, for example, the king-jack stud 1202.

[0180] In some embodiments, the referencing members and datum pins are arranged to form channels of differing sizes to receive certain target multi-ply subassemblies within the channel. In some other or additional embodiments, the referencing members and datum pins are arranged to form channels at certain predefined locations on the rack 1500 to receive certain target multi-ply subassemblies within those channels. In such embodiments, the referencing members and datum pins are arranged to form channels of differing sizes depending on which sub-assembly is intended to be placed within the channel. In some embodiments, the referencing members are retractable so that multi-ply sub-assemblies or single-ply workpieces of various dimensions can be accommodated in the same area of the shelf. In some embodiments, the referencing members are laterally adjustable, such as, for example, the referencing members are movable side-to-side.

[0181] Reference is next made to FIG. 19, which shows a block diagram of an example server 1602 of FIG. 16 in accordance with one or more embodiments. Server 1602 can include a communication unit 1804, an I / O unit 1812, a power unit 1802, a processor unit 1808, and a memory unit 1810, which are analogous to communication unit 304, I / O unit 312, power unit 302, processor unit 308, and memory unit 310 of FIG. 3, respectively.

[0182] Server 1602 may also comprise an operating system 1620, programs 1622, image processing module 1632, and mapping module 1630, which may be analogous to operating system 320, programs 322, image processing module 332, and mapping module 330 of FIG. 3, respectively.

[0183] Server 1602 may additionally comprise a classification model 1823, a segmentation model 1824, a corner detection module 1828, and a corner selection module 1829.

[0184] Classification model 1823 may perform background subtraction and determine whether each predefined channel in the support structure is occupied by a workpiece or is empty. The model may operate on pre-processed image data to isolate regions corresponding to lumber material from background regions such as the shelf or cart surface. After background subtraction, the classification model outputs a binary indication for each channel, such as “occupied” or “unoccupied.” In some embodiments, the classification model may comprise a support vector classifier trained on features extracted from channel regions, including pixel intensity distributions, edge density, and color histograms. The model may use a kernel-based approach (e.g., radial basis function) to separate occupied channels from empty channels in feature space. The output of the classification model may be used to infer the number of plies in a multi-ply assembly by counting consecutive occupied channels. In some embodiments, the classification model may be implemented using scikit-learn or an equivalent machine learning framework. Performing background subtraction and channel occupancy detection prior to segmentation reduces computational load by limiting subsequent processing to occupied channels.

[0185] The segmentation model 1824 may be a SAM segmentation model. The SAM segmentation model may generate segmentation masks for workpieces based on multi-point prompts derived from detected channel positions. The prompts may be automatically generated from datum points defining the channel boundaries. The SAM model may accept positive prompts corresponding to regions where workpieces are likely to be present and negative prompts corresponding to regions where workpieces are likely absent. For example, if the angle, position, and field of view of the imaging device capturing the image is fixed and known in advance, areas less likely to contain workpieces and areas more likely to contain workpieces can be identified ahead of time. For example, referring to the example input image in FIG. 17, it can be known ahead of time, based on knowing the layout of shelf 1500, that there are unlikely to be any workpieces below datum plate 1512. Therefore, the region below datum plate 1512 can be negatively prompted to indicate the absence of workpieces. Additionally, a maximum height of the sub-assembly 1502 positioned on top of the datum plate 1512 may be known ahead of time. As such, a positive prompt can be generated encompassing the area in the image that the workpieces would likely occupy at most.
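The prompt generation described above can be sketched as follows. This is a minimal illustration only, assuming a fixed camera so that the datum plate row and the maximum sub-assembly height are known in pixel units. The names `ChannelLayout` and `build_prompts` and all numeric values are hypothetical and are not taken from the disclosure. Label 1 marks a positive point and label 0 a negative point, which is the convention commonly used by point-prompted segmentation interfaces.

```python
from dataclasses import dataclass


@dataclass
class ChannelLayout:
    """Hypothetical pixel-space description of one channel in a fixed camera view."""
    x_left: int         # left channel boundary (pixel column)
    x_right: int        # right channel boundary (pixel column)
    plate_y: int        # pixel row of the datum plate the sub-assembly rests on
    max_height_px: int  # tallest sub-assembly expected above the plate


def build_prompts(layout, n_positive=3, negative_offset=20):
    """Return (points, labels): 1 = workpiece likely present, 0 = likely absent."""
    x_mid = (layout.x_left + layout.x_right) // 2
    points, labels = [], []
    # Positive points: spread evenly over the region above the datum plate
    # that a workpiece could occupy at most. Image y grows downward.
    for i in range(1, n_positive + 1):
        y = layout.plate_y - i * layout.max_height_px // (n_positive + 1)
        points.append((x_mid, y))
        labels.append(1)
    # Negative point: below the datum plate, where no workpiece is expected.
    points.append((x_mid, layout.plate_y + negative_offset))
    labels.append(0)
    return points, labels


layout = ChannelLayout(x_left=100, x_right=180, plate_y=400, max_height_px=200)
points, labels = build_prompts(layout)
```

The resulting point and label lists could then be passed to whichever promptable segmentation model is used; that call is omitted here because its exact interface depends on the chosen implementation.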

[0186] The SAM model may be configured to produce coarse segmentation boundaries that approximate the edges of lumber pieces and identify intersection points where edges meet. These intersection points may serve as candidate corner locations for subsequent refinement. In some embodiments, the SAM model may comprise a transformer-based architecture pre-trained on large-scale image datasets and fine-tuned for lumber segmentation tasks. The output of the SAM model may include one or more segmentation masks delineating the approximate contours of each workpiece within the channel. These masks may be used to initialize enhanced corner detection algorithms with sub-pixel refinement.

[0187] The corner detection module 1828 may identify candidate corner points for each workpiece instance within an occupied channel. The module may operate on segmentation outputs generated by the SAM model and edge maps derived from image gradients. In some embodiments, the module may employ Harris or Shi-Tomasi corner detection algorithms to locate high-curvature points indicative of corners. To improve localization accuracy, the module may apply sub-pixel refinement, estimating corner positions at a resolution finer than the pixel grid (e.g., computing fractional pixel coordinates such as (45.7, 102.3) rather than integer coordinates). Sub-pixel refinement may use interpolation of intensity gradients to refine corner positions beyond pixel-level accuracy, enabling precise dimension measurement. The output of the corner detection module may comprise a set of candidate corner points for each channel, which are subsequently processed by the corner selection module.
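As an illustration of the corner response mentioned above, the following pure-Python sketch computes the Shi-Tomasi score (the smaller eigenvalue of the local gradient structure tensor) at a given pixel. It is a simplified sketch under stated assumptions: the function names and the synthetic test image are hypothetical, gradients are plain central differences, and sub-pixel refinement is not shown. A production system would more likely rely on an optimized vision library.

```python
import math


def gradients(img):
    """Central-difference gradients; border pixels are left at zero."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return ix, iy


def shi_tomasi_score(img, cx, cy, radius=1):
    """Smaller eigenvalue of the structure tensor summed over a square window."""
    ix, iy = gradients(img)
    sxx = syy = sxy = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            sxx += ix[y][x] * ix[y][x]
            syy += iy[y][x] * iy[y][x]
            sxy += ix[y][x] * iy[y][x]
    half_trace = (sxx + syy) / 2.0
    return half_trace - math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)


# Synthetic 9x9 image: a bright block whose top-left corner sits at (4, 4).
img = [[1.0 if x >= 4 and y >= 4 else 0.0 for x in range(9)] for y in range(9)]

corner_score = shi_tomasi_score(img, 4, 4)  # at the block corner
edge_score = shi_tomasi_score(img, 4, 6)    # on a straight vertical edge
flat_score = shi_tomasi_score(img, 1, 1)    # in a uniform region
```

The score is large only where the image varies in two directions at once, which is why a straight edge and a flat region both score zero while the block corner does not.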

[0188] FIG. 20A and 20B show an example of a king-jack sub-assembly 1902 with candidate corners 1904a to 1904h, and an example of a header-sill sub-assembly 1912 with candidate corners 1914a to 1914h, respectively. The corner detection module 1828 may detect at least the candidate corners 1904a, 1904b, 1904c, 1904d, 1904e, 1904f, 1904g, and 1904h in sub-assembly 1902, and at least the candidate corners 1914a, 1914b, 1914c, 1914d, 1914e, 1914f, 1914g, and 1914h in sub-assembly 1912.

[0189] The corner selection module 1829 may select from among candidate corner points the correct corners for dimension measurement and assembly identification. The selection process may include employing a Hough-line detection algorithm to identify horizontal lines, then selecting the leftmost and rightmost corners near this line that are within a predefined pixel threshold (e.g., 10 pixels).

[0190] The Hough-line detection algorithm may scan the sub-assembly to detect all straight lines in the image of the sub-assembly. For example, for king-jack sub-assembly 1902, at least lines 1906a, 1906b, 1906c, 1906d, 1906e, 1906f, 1906g, and 1906h may be detected through Hough-line filtering as shown in FIG. 21A. In another example, such as header-sill assembly 1912, at least lines 1916a, 1916b, 1916c, 1916d, 1916e, 1916f, 1916g, and 1916h may be detected through Hough-line filtering as shown in FIG. 21B.

[0191] Out of these straight lines, one or more horizontal lines that are located closest to the top of the image may be chosen for corner selection, as shown in FIG. 22A and 22B. First, the line that is horizontal (or nearly so) and closest to the top of the image is chosen. This would be line 1906a in sub-assembly 1902, and line 1916a in sub-assembly 1912. The leftmost and rightmost detected corners that are within a predefined pixel threshold of this line are then selected. Therefore, corners 1904a and 1904b are selected for sub-assembly 1902, and corners 1914c and 1914d are selected for sub-assembly 1912. The horizontal line that is second closest to the top of the image is also identified. This would be line 1906b for sub-assembly 1902, and line 1916b for sub-assembly 1912. The leftmost and rightmost detected corners that are within a predefined pixel threshold of this line are then selected. Therefore, corners 1904d and 1904e are selected for sub-assembly 1902, and corners 1914a and 1914b are selected for sub-assembly 1912.
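The selection rule described above may be sketched as follows. This is an illustrative sketch only: it assumes the near-horizontal lines have already been extracted by a line detector and reduced to their image row (`line_ys`), and that candidate corners are given as (x, y) pixel tuples. The function name `select_corners` and all coordinates are hypothetical.

```python
def select_corners(corners, line_ys, threshold_px=10):
    """For the two horizontal lines closest to the image top, pick the
    leftmost and rightmost candidate corners within threshold_px of the line."""
    selected = []
    for line_y in sorted(line_ys)[:2]:  # smaller y means closer to the top
        near = [c for c in corners if abs(c[1] - line_y) <= threshold_px]
        if len(near) < 2:
            selected.append(None)  # not enough corners near this line
            continue
        left = min(near, key=lambda c: c[0])
        right = max(near, key=lambda c: c[0])
        selected.append((left, right))
    return selected


# Hypothetical candidate corners (x, y) and detected horizontal line rows.
corners = [
    (100.0, 52.0), (138.0, 48.0), (119.0, 55.0),  # near the top line
    (140.0, 118.0), (178.0, 123.0),               # near the second line
    (100.0, 300.0), (178.0, 301.0),               # near a lower line, ignored
]
line_ys = [120.0, 50.0, 300.0]

top_pair, second_pair = select_corners(corners, line_ys)
```

A pair is reported as `None` when fewer than two corners fall within the threshold, so that a downstream step can flag the image rather than measure from a single point.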

[0192] To identify what kind of sub-assembly is captured in the image, the pairs of selected corners may be compared to each other. For example, because the top pair of selected corners (1914c and 1914d) in sub-assembly 1912 are closer to each other in pixel space than the other pair of selected corners (1914a and 1914b), the system is able to identify that sub-assembly 1912 is a header-sill sub-assembly. The system may also verify sill orientation by comparing corner spacing and validating that the sill is positioned on the correct side of the header according to expected layout specifications. For example, based on the fact that selected corners with narrower spacing (1914c and 1914d) are to the right of selected corners 1914a and 1914b, the system is able to identify that sub-assembly 1912 is in a right-sill orientation.
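A minimal sketch of this comparison is given below. The header-sill rule (narrower top pair, with its side indicating sill orientation) follows the description above. The rule that two pairs of roughly equal spacing indicate parallel plies, the `ratio_tol` parameter, the function names, and the coordinates are assumptions introduced for illustration.

```python
def pair_spacing(pair):
    """Horizontal pixel distance between the two corners of a pair."""
    (x1, _), (x2, _) = pair
    return abs(x2 - x1)


def pair_center_x(pair):
    """Mean x position of the two corners of a pair."""
    return (pair[0][0] + pair[1][0]) / 2.0


def classify_sub_assembly(top_pair, second_pair, ratio_tol=0.15):
    """Return (type, orientation) from the two selected corner pairs."""
    top, second = pair_spacing(top_pair), pair_spacing(second_pair)
    # Assumption: near-equal spacing means parallel plies of the same width.
    if abs(top - second) <= ratio_tol * max(top, second):
        return "parallel-ply", None
    if top < second:
        # Narrower top pair: header-sill. Its side gives the sill orientation.
        if pair_center_x(top_pair) > pair_center_x(second_pair):
            return "header-sill", "right-sill"
        return "header-sill", "left-sill"
    return "unknown", None


header_sill = classify_sub_assembly(
    top_pair=((300.0, 40.0), (340.0, 42.0)),
    second_pair=((100.0, 90.0), (300.0, 91.0)),
)
king_jack = classify_sub_assembly(
    top_pair=((100.0, 50.0), (138.0, 50.0)),
    second_pair=((140.0, 120.0), (178.0, 120.0)),
)
```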

[0193] The mapping module 1830 may project selected corners from image pixel space to calibrated workpiece plane coordinates. The module may use a calibration square or pattern-based reference element placed on the workpiece plane to determine the transformation required for mapping. In some embodiments, the mapping module may compute a homography or perspective transformation based on detected calibration patterns, correcting for skew, rotation, and scale distortions. For assemblies with components on different planes (e.g., header-sill), the module may apply depth adjustments or stereo-based reprojection to normalize corner positions to a common reference plane. The output of the mapping module may comprise real-world coordinates for each selected corner, which may be used to compute measured characteristics such as length, width, and aggregate thickness.
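Applying a homography to a selected corner can be sketched as follows. The example assumes a 3x3 matrix has already been estimated from the calibration pattern; the matrices shown are hypothetical placeholders, and the estimation step itself is not shown.

```python
def apply_homography(h, point):
    """Map an image point (x, y) to plane coordinates using a 3x3 homography."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to obtain plane coordinates.
    return (u / w, v / w)


# Hypothetical calibration result: pure scale (0.5 units per pixel) plus offset.
H_SCALE = [
    [0.5, 0.0, -10.0],
    [0.0, 0.5, -20.0],
    [0.0, 0.0, 1.0],
]
# Hypothetical calibration result with a small perspective term.
H_PERSPECTIVE = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.001, 0.0, 1.0],
]

scaled = apply_homography(H_SCALE, (100.0, 60.0))
warped = apply_homography(H_PERSPECTIVE, (100.0, 50.0))
```

The division by the third homogeneous coordinate is what allows a single matrix to correct perspective foreshortening in addition to rotation and scale.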

[0194] The system may compute the width and length of each ply based on selected corner points projected onto the calibrated workpiece plane. For multi-ply sub-assemblies with parallel plies fastened together, such as king-jack assembly 1902, the width may be determined by calculating the distance between the selected corners (1904a and 1904b) on the top horizontal line of the first ply and between selected corners (1904d and 1904e) on the second horizontal line. To calculate the distance between the corners, the perpendicular distance between each corner and the channel edge may be computed. For example, in FIG. 23, the perpendicular distance 1901 between selected corner 1904e and a user-defined channel edge 1905 is indicated. This may be subtracted from the perpendicular distance 1903 between selected corner 1904d and channel edge 1905 to obtain the width of the jack stud. The length of each ply may be determined by measuring the distance from each pair of corners to the corresponding predetermined channel endpoints. For example, in FIG. 23, the distance 1907 from the top pair of selected corners 1904a and 1904b and the channel endpoint 1909 is indicated. This can be subtracted from the total length from one channel endpoint to another to obtain the length of the king stud. In some embodiments, the system may apply perspective correction to account for camera angle and ensure accurate real-world measurements. The measured dimensions may then be validated against expected values associated with the predefined location.
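The subtraction-based measurements described above can be sketched as follows, working in workpiece-plane units after mapping. The function names and all numeric values are hypothetical, and the channel edge is assumed to be represented by two points lying on it.

```python
import math


def perpendicular_distance(point, line_a, line_b):
    """Distance from point to the infinite line through line_a and line_b."""
    (px, py), (ax, ay), (bx, by) = point, line_a, line_b
    dx, dy = bx - ax, by - ay
    return abs(dx * (py - ay) - dy * (px - ax)) / math.hypot(dx, dy)


def ply_width(far_corner, near_corner, edge_a, edge_b):
    """Width = far-corner distance minus near-corner distance to the channel edge."""
    return (perpendicular_distance(far_corner, edge_a, edge_b)
            - perpendicular_distance(near_corner, edge_a, edge_b))


def ply_length(channel_length, gap_to_endpoint):
    """Length = full channel length minus the gap left at the channel endpoint."""
    return channel_length - gap_to_endpoint


# Hypothetical plane coordinates in inches; the channel edge lies along x = 0.
edge_a, edge_b = (0.0, 0.0), (0.0, 100.0)
width = ply_width(far_corner=(3.0, 10.0), near_corner=(1.5, 10.0),
                  edge_a=edge_a, edge_b=edge_b)
length = ply_length(channel_length=96.0, gap_to_endpoint=3.5)
```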

[0195] For header-sill assemblies, the system may compute width and length while accounting for multiple reference planes. The width may be determined by measuring the distance between the outermost corners among the selected corners (1914a and 1914d) after applying depth adjustments to normalize the header and sill planes. To calculate the distance between the corners, the perpendicular distance between each corner and the channel edge may be computed.

[0196] The depth adjustments may be based on the known distance between corners 1914b and 1914c. Once a header-sill sub-assembly type has been identified, the extent of the sill protrusion represented by the distance between the corners 1914b and 1914c can be inferred from the standardized depths of the header and the sill. For example, it may be known that the header has a depth of 3.5 inches, and the sill has a depth of 6 inches. Therefore, it is known that the sill will protrude from the back face of the header by 2.5 inches. In some embodiments, the system may use stereo vision or calibration-based reprojection to correct for elevation differences between the header and sill surfaces.

[0197] The length or height of the sub-assembly 1912 may be computed by measuring the distance between the inner corners (1914c and 1914d) and the channel endpoints.

[0198] Reference is next made to FIG. 18, which shows the example vision system 1600 of FIG. 16 in accordance with some alternative embodiments, operating in a physical location 1614 that contains multiple predefined areas 1610. While predefined areas 1610a and 1610b are explicitly shown, it will be appreciated that any number of pre-defined areas may exist in the physical location 1614. System 1600 may perform material verification and other computer vision processing tasks for some or all of the predefined areas, including areas 1610a, 1610b, and various other predefined areas that are not necessarily shown. In order to accomplish this, a plurality of imaging devices 1604 may be used. For example, one or more imaging devices 1604 may be provided for each predefined area. Alternatively, one or more imaging devices 1604 may cover multiple predefined areas. For example, an imaging device 1604 may be mounted on the end of a robot arm or a swiveling base. The imaging device 1604 may then be repositionable so as to be able to capture multiple views, including views of two or more predefined areas.

[0199] System 1600 may contain multiple user interface devices 1606. User interface devices 1606a and 1606b are shown in FIG. 18, which may correspond to predefined areas 1610a and 1610b, respectively. A user interface device 1606 may be provided for each predefined area 1610. The user interface device 1606 may display data specific to the predefined area. For example, a user interface device may display annotated images or alerts relating to the predefined area it is associated with, as produced by the vision system. In some embodiments, one user interface device may be provided for multiple predefined areas. In such cases, the user interface device 1606 may provide capability to switch between the different predefined areas. System 1600 may additionally contain one or more servers 1602. Even though only one server 1602 is shown in FIG. 18, it will be appreciated that any number of servers 1602 can be used and connected to the network 1612.

[0200] Reference is next made to FIG. 24, which shows a method 2400 for automated verification of multi-ply sub-assemblies located in a predefined area at a physical location. Method 2400 may help ensure that certain characteristics of the multi-ply sub-assembly used in the robotic assembly process at the predefined location, such as type, orientation, and dimension, are verified against information associated with a specific task. Additionally, method 2400 may facilitate estimating an accurate position of a sub-assembly for a pick-place task.

[0201] Method 2400 may be implemented by, for example, system 1600 of FIG. 16 and FIG. 18, on multi-ply assembly 1608 located in a predefined location 1610 at a physical location 1614.

[0202] The method begins at 2402 with acquiring, at one or more imaging devices in communication with a processor, one or more input images of the multi-ply sub-assembly at the predefined area. This step is analogous to step 502 of method 500.

[0203] The method proceeds to 2404 with performing, at the processor, background subtraction on the one or more input images to isolate regions corresponding to lumber material from background regions such as the shelf or cart surface. For example, server 1602 may operate a classification model 1823, as shown in FIG. 19, to classify each predefined channel as “occupied” or “unoccupied”. The classification model 1823 may be a support vector classification model. This step reduces computational load by limiting subsequent processing to occupied channels.

[0204] The method proceeds to 2406 with generating, for each occupied channel, multi-point prompts comprising at least one of a positive prompt and a negative prompt. Positive prompts correspond to regions where workpieces are likely to be present, and negative prompts correspond to regions where workpieces are likely absent. The prompts may be automatically generated based on known positions of datum pins defining channel boundaries that were input by the user at step 2404.

[0205] The method proceeds to 2408 with segmenting, at the processor, at least one input image based on a segmentation model to obtain one or more segmentation masks delineating the approximate contours of each workpiece within the occupied channel. Server 1602 may operate a SAM model as segmentation model 1824, as shown in FIG. 19, to generate segmentation masks for workpieces based on the multi-point prompts. The segmentation masks may include intersection points where edges meet, which serve as candidate corner locations for subsequent refinement.

[0206] The method proceeds to 2410 with detecting, at the processor, candidate corner points within each segmentation mask. Corner detection may be performed by the corner detection module 1828 of FIG. 19. In some embodiments, the module may employ Harris or Shi-Tomasi corner detection algorithms to locate high-curvature points indicative of corners. To improve localization accuracy, the module may apply sub-pixel refinement, estimating corner positions at a resolution finer than the pixel grid. The output of the corner detection module may comprise a set of candidate corner points for each channel. For example, referring to FIGS. 20A and 20B, at least candidate corner points 1904a-1904h may be detected for sub-assembly 1902, and at least candidate corner points 1914a-1914h may be detected for sub-assembly 1912.

[0207] The method proceeds to 2412 with selecting, at the processor, corners from among the candidate corner points. Corner selection may be performed by the corner selection module 1829 of FIG. 19. The selection process may include detecting horizontal lines via Hough-line filtering and selecting the leftmost and rightmost corners near these lines within a predefined pixel threshold.

[0208] The Hough-line detection algorithm may scan the sub-assembly to detect all straight lines in the image of the sub-assembly. For example, for king-jack sub-assembly 1902, at least lines 1906a, 1906b, 1906c, 1906d, 1906e, 1906f, 1906g, and 1906h may be detected through Hough-line filtering as shown in FIG. 21A. In another example, such as header-sill assembly 1912, at least lines 1916a, 1916b, 1916c, 1916d, 1916e, 1916f, 1916g, and 1916h may be detected through Hough-line filtering as shown in FIG. 21B.

[0209] Out of these straight lines, one or more horizontal lines that are located closest to the top of the image may be chosen for corner selection, as shown in FIG. 22A and 22B. First, the line that is horizontal (or nearly so) and closest to the top of the image is chosen. This would be line 1906a in sub-assembly 1902, and line 1916a in sub-assembly 1912. The leftmost and rightmost detected corners that are within a predefined pixel threshold of this line are then selected. Therefore, corners 1904a and 1904b are selected for sub-assembly 1902, and corners 1914c and 1914d are selected for sub-assembly 1912. The horizontal line that is second closest to the top of the image is also identified, which would be line 1906b for sub-assembly 1902, and line 1916b for sub-assembly 1912. The leftmost and rightmost detected corners that are within a predefined pixel threshold of this line are then selected. Therefore, corners 1904d and 1904e are selected for sub-assembly 1902, and corners 1914a and 1914b are selected for sub-assembly 1912.

[0210] The method proceeds to 2414 with identifying, at the processor, the type of sub-assembly that has been captured. To identify what kind of sub-assembly is captured in the image, the pairs of selected corners may be compared to each other. For example, because the top pair of selected corners (1914c and 1914d) in subassembly 1912 are closer to each other in pixel space than the other pair of selected corners (1914a and 1914b), the system is able to identify that sub-assembly 1912 is a header-sill sub-assembly.

[0211] The method proceeds to 2416 with mapping, at the processor, the selected corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration. The mapping may be performed by mapping module 1830 of FIG. 19. The module may use a calibration square or pattern-based reference element placed on the workpiece plane to determine the transformation required for mapping. In some embodiments, the mapping module may compute a homography or perspective transformation based on detected calibration patterns, correcting for skew, rotation, and scale distortions. For assemblies with components on different planes (e.g., header-sill), the module may apply depth adjustments or stereo-based reprojection to normalize corner positions to a common reference plane. The output of the mapping module may comprise real-world coordinates for each selected corner, which may be used to compute measured characteristics such as length, width, and aggregate thickness.

[0212] The method proceeds to 2418 with computing, at the processor, one or more measured characteristics of the multi-ply sub-assembly based on the mapped corner coordinates and the identified type of sub-assembly. Characteristics may include length, width, and aggregate thickness.

[0213] For multi-ply sub-assemblies with parallel plies fastened together, such as king-jack assembly 1902, the width may be determined by calculating the distance between the selected corners (1904a and 1904b) of the first ply and between selected corners (1904d and 1904e) of the second ply. To calculate the distance between the corners, the perpendicular distance between each corner and the channel edge may be computed. For example, in FIG. 23, the perpendicular distance 1901 between selected corner 1904e and a channel edge 1905 is indicated. This may be subtracted from the perpendicular distance 1903 between selected corner 1904d and channel edge 1905 to obtain the width of the jack stud. The length of each ply may be determined by measuring the distance from each pair of corners to the corresponding predetermined channel endpoints. For example, in FIG. 23, the distance 1907 from the top pair of corners 1904a and 1904b and the channel endpoint 1909 is indicated. This can be subtracted from the total length from one channel endpoint to another to obtain the length of the king stud. In some embodiments, the system may apply perspective correction to account for camera angle and ensure accurate real-world measurements. The measured dimensions may then be validated against expected values associated with the predefined location.

[0214] For header-sill assemblies, the system may compute width and length while accounting for multiple reference planes. The width may be determined by measuring the distance between the outermost corners among the selected corners (1914a and 1914d) after applying depth adjustments to normalize the header and sill planes. To calculate the distance between the corners, the perpendicular distance between each corner and the channel edge may be computed.

[0215] The depth adjustments may be based on the known distance between corners 1914b and 1914c. Once a header-sill sub-assembly type has been identified, the extent of the sill protrusion represented by the distance between the corners 1914b and 1914c can be inferred from the standardized depths of the header and the sill. For example, it may be known that the header has a depth of 3.5 inches, and the sill has a depth of 6 inches. Therefore, it is known that the sill will protrude from the back face of the header by 2.5 inches. In some embodiments, the system may use stereo vision or calibration-based reprojection to correct for elevation differences between the header and sill surfaces.

[0216] The length or height of the sub-assembly 1912 may be computed by measuring the distance between the inner corners (1914c and 1914d) and the channel endpoints.

[0217] The method proceeds to 2420 with verifying, at the processor, the one or more measured characteristics of the sub-assembly with one or more expected characteristics associated with the predefined location, producing a verification indication. This step is analogous to step 516 of method 500.
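The comparison of measured and expected characteristics might be sketched as follows. The disclosure does not specify a tolerance or a data format for the verification indication, so the `tolerance` parameter, the dictionary layout, and the sample values are all assumptions made for illustration.

```python
def verify(measured, expected, tolerance):
    """Compare measured characteristics with expected ones.

    Returns a verification indication: an overall pass flag plus, for each
    failing characteristic, the (measured, expected) pair.
    """
    failures = {}
    for name, target in expected.items():
        value = measured.get(name)
        # A missing measurement counts as a failure.
        if value is None or abs(value - target) > tolerance:
            failures[name] = (value, target)
    return {"passed": not failures, "failures": failures}


# Hypothetical values in inches for one sub-assembly at one predefined location.
expected = {"width": 1.5, "length": 92.625}
good = verify({"width": 1.52, "length": 92.6}, expected, tolerance=0.125)
bad = verify({"width": 1.52, "length": 92.4}, expected, tolerance=0.125)
```

Returning the failing characteristics alongside the pass flag lets a user interface show which dimension was out of range rather than only a pass or fail result.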

[0218] The method proceeds to 2422 with outputting, at a user interface in communication with the processor, the verification indication on a display of the user interface. This step is analogous to step 518 of method 500.

[0219] In some embodiments, a series of calibrations may be carried out before commencing method 2400. For example, this may include receiving, at the processor, user input defining the start and end points of each channel on the support structure. The channel endpoints may correspond to referencing members such as datum pins or plates. This information may be used to define channel boundaries for subsequent classification and segmentation steps. Other calibrations may include defining camera intrinsics and reference planes.

[0220] In some embodiments, step 2420 of method 2400 may involve determining whether the imaged material contains multiple plies, and, if so, determining the number of plies in the imaged material. For example, if the header section of the shelf is unoccupied, it can be assumed that either a single ply workpiece or a multi-ply assembly with parallel plies fastened together has been imaged. In this case, the processor may check whether at least two consecutive precut channels are occupied. The count of these occupied channels determines the number of plies. In determining the number of plies in the sub-assembly, it may be assumed that each individual ply in the multi-ply assembly is of the same predetermined width.
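Counting plies from per-channel occupancy may be sketched as follows. The sketch assumes the classification step yields one boolean per precut channel in left-to-right order, and it takes the longest run of consecutive occupied channels as the ply count; both the function names and that interpretation are assumptions.

```python
def count_plies(occupied):
    """Length of the longest run of consecutive occupied channels."""
    best = run = 0
    for flag in occupied:
        run = run + 1 if flag else 0  # reset the run at an empty channel
        best = max(best, run)
    return best


def describe(occupied):
    """Summarize the occupancy pattern as empty, single-ply, or N-ply."""
    n = count_plies(occupied)
    if n == 0:
        return "empty"
    return "single-ply" if n == 1 else f"{n}-ply"


two_ply = describe([False, True, True, False])
single = describe([True, False, False, False])
empty = describe([False, False, False])
```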

[0221] In some embodiments, method 2400 may include determining the orientation of the sub-assembly based on the selected corners. For example, the system may verify sill orientation by comparing corner spacing and validating that the sill is positioned on the correct side of the header according to expected layout specifications. For example, after identifying that the sub-assembly is a header-sill sub-assembly at step 2414, the system is able to identify that sub-assembly 1912 is in a right-sill orientation based on the fact that the top selected corners with narrower spacing (1914c and 1914d) are to the right of selected corners 1914a and 1914b.

[0222] While the above description describes features of example embodiments, it will be appreciated that some features and / or functions of the described embodiments are susceptible to modification without departing from the spirit and principles of operation of the described embodiments. For example, the various characteristics which are described by means of the represented embodiments or examples may be selectively combined with each other. Accordingly, what has been described above is intended to be illustrative of the claimed concept and non-limiting. It will be understood by persons skilled in the art that other variants and modifications may be made without departing from the scope of the invention as defined in the claims appended hereto. The scope of the claims should not be limited by the preferred embodiments and examples, but should be given the broadest interpretation consistent with the description as a whole.

Claims

CLAIMS:

1. A method for automated verification of a plurality of workpieces, the plurality of workpieces being located in a predefined area at a physical location and comprising workpieces of differing dimensions, wherein at least some of the plurality of workpieces are in a vertical orientation, comprising: acquiring, at one or more imaging devices in communication with a processor, one or more input images of the plurality of workpieces at the predefined area; segmenting, at the processor, at least one input image of the one or more input images based on a semantic segmentation model to generate at least one corresponding semantically segmented image, the at least one semantically segmented image comprising at least one visible segment and at least one masked segment; identifying, at the processor, one or more identified workpiece instances from the at least one input image based on an instance segmentation model, the at least one semantically segmented image and one or more context inputs; generating, at the processor, at least one corresponding instance segmented image based on the one or more identified workpiece instances; determining, at the processor, based on an edge detection model, one or more edges for the one or more identified workpiece instances for the at least one corresponding instance image; mapping, at the processor, each of the one or more edges and the one or more corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration; determining, at the processor, one or more measured characteristics of each workpiece instance based at least on the one or more sets of coordinates; verifying, at the processor, for each workpiece instance, the one or more measured characteristics of the workpiece instance with one or more expected characteristics associated with the predefined location, producing a verification indication; and outputting, at a user interface in communication with the processor, the verification indication on a display of the user interface.
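
The mapping, measuring and verifying steps of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes the workpiece plane calibration is available as a 3x3 homography from pixels to millimetres on the workpiece plane, that each workpiece instance has four corners listed in order around its outline, and that verification is a simple tolerance check. The names `to_plane`, `measure`, `verify` and the 3 mm tolerance are hypothetical.

```python
from math import hypot


def to_plane(h, px, py):
    """Map a pixel (px, py) to workpiece-plane coordinates using a 3x3 homography h."""
    x = h[0][0] * px + h[0][1] * py + h[0][2]
    y = h[1][0] * px + h[1][1] * py + h[1][2]
    w = h[2][0] * px + h[2][1] * py + h[2][2]
    return x / w, y / w


def measure(h, corners_px):
    """Return (length, width) in plane units from four corner pixels given in outline order."""
    pts = [to_plane(h, x, y) for x, y in corners_px]
    sides = [
        hypot(pts[i][0] - pts[(i + 1) % 4][0], pts[i][1] - pts[(i + 1) % 4][1])
        for i in range(4)
    ]
    # Average opposite sides so small corner errors partly cancel.
    a = (sides[0] + sides[2]) / 2
    b = (sides[1] + sides[3]) / 2
    return max(a, b), min(a, b)


def verify(measured, expected, tol_mm=3.0):
    """Verification indication: True when every measured value is within tolerance."""
    return all(abs(m - e) <= tol_mm for m, e in zip(measured, expected))


# Hypothetical calibration: a pure scale of 0.5 mm per pixel.
H = ((0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 1.0))
corners = [(100, 100), (176, 100), (176, 4976), (100, 4976)]
print(verify(measure(H, corners), expected=(2438.0, 38.0)))
```

In practice the homography would come from the calibration step rather than being written by hand, and the expected characteristics would be looked up for the predefined location.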

2. The method of claim 1, wherein the plurality of workpieces is arranged on a support structure.

3. The method of claim 1 or 2, wherein the support structure comprises a plurality of referencing members for aligning the plurality of workpieces with respect to one another.

4. The method of claim 2 or 3 wherein a subset of referencing members from the plurality of referencing members cooperate to define at least one channel for receiving at least one workpiece of the plurality of workpieces.

5. The method of any one of claims 2 to 4, wherein the referencing members comprise at least one of: a pin protruding out of a surface of the support structure; and a plate arranged orthogonally to the surface of the support structure.

6. The method of any one of claims 2 to 5, wherein the support structure comprises a wall, a shelf, or a material cart.

7. The method of any one of claims 1 to 6, wherein the workpiece plane calibration comprises a calibration board placed on the workpiece plane, the calibration board comprising a pattern readable by the processor to determine at least one of a relative position and a relative orientation.

8. The method of any one of claims 1 to 7, wherein the context input comprises at least one of a positive prompt and a negative prompt, wherein the positive prompt corresponds to a region where workpieces are likely to be present and the negative prompt corresponds to a region where workpieces are likely to be absent.
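
One way to picture the positive and negative prompts is the following minimal Python sketch. It is an illustration under stated assumptions, not the claimed method. It assumes each region where a workpiece is expected is known as a pixel box `(x0, y0, x1, y1)`, places one positive point at the centre of each box, and places one negative point in each gap between horizontally adjacent boxes. The function name `build_prompts` and the label convention (1 for positive, 0 for negative) are hypothetical choices.

```python
def build_prompts(regions):
    """Build point prompts from expected-workpiece boxes (x0, y0, x1, y1).

    Returns (points, labels), where label 1 marks a positive prompt and
    label 0 marks a negative prompt.
    """
    ordered = sorted(regions)  # left to right by x0
    points, labels = [], []

    # Positive prompts: centre of every region where a workpiece is expected.
    for x0, y0, x1, y1 in ordered:
        points.append(((x0 + x1) / 2, (y0 + y1) / 2))
        labels.append(1)

    # Negative prompts: middle of each gap between neighbouring regions.
    for (_, y0, x1, y1), (next_x0, _, _, _) in zip(ordered, ordered[1:]):
        if next_x0 > x1:
            points.append(((x1 + next_x0) / 2, (y0 + y1) / 2))
            labels.append(0)

    return points, labels


points, labels = build_prompts([(0, 0, 10, 100), (20, 0, 30, 100)])
print(points, labels)
```

Because the prompts depend only on the layout of the predefined area and the camera view, they could be computed once per area and reused for every image.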

9. The method of any one of claims 1 to 8, wherein the context input is predetermined for each predefined area based on characteristics of the imaging device.

10. The method of any one of claims 1 to 9, wherein the one or more measured characteristics comprises one or more of a length, a width, a height, a position, and a material type.

11. The method of any one of claims 1 to 10, wherein the identified workpiece instances comprise a lumber instance.

12. The method of any one of claims 1 to 11, wherein the one or more imaging devices comprises at least one monocular camera.

13. The method of any one of claims 1 to 11, wherein the one or more imaging devices comprises at least one stereoscopic camera.

14. The method of any one of claims 1 to 13, wherein the one or more imaging devices comprises at least one monocular camera and at least one stereoscopic camera.

15. The method of any one of claims 1 to 14, wherein the instance segmentation model is fine-tuned on one or more training materials, the one or more training materials associated with a material type of at least one workpiece of the plurality of workpieces.

16. The method of any one of claims 1 to 15, wherein: the support structure is colored with one or more pre-selected colors; and the segmenting the at least one input image is further based on the one or more pre-selected colors.

17. The method of any one of claims 1 to 16, further comprising pre-processing the at least one input image by adjusting at least one brightness characteristic and at least one darkness characteristic of the at least one input image.

18. The method of any one of claims 1 to 17, wherein the semantic segmentation model comprises a convolutional neural network.

19. The method of any one of claims 1 to 18, wherein the instance segmentation model comprises a vision transformer.

20. A system for automated verification of a plurality of workpieces, comprising: one or more imaging devices in communication with a processor, configured to acquire one or more input images of the plurality of workpieces at the predefined area and transmit the one or more input images to the processor; a user interface in communication with the processor, configured to receive a verification indication from the processor and output the verification indication on a display of the user interface; and the processor, configured to: i) receive the one or more input images from the one or more imaging devices; ii) segment at least one input image of the one or more input images based on a semantic segmentation model; iii) generate a corresponding semantically segmented image based on the segmenting, the semantically segmented image comprising at least one visible segment and at least one hidden segment; iv) identify one or more identified workpiece instances from the at least one input image based on an instance segmentation model, the segment image and one or more context inputs; v) generate at least one corresponding instance image based on the one or more identified workpiece instances; vi) determine, based on an edge detection model, one or more edges and one or more corners for the one or more identified workpiece instances for the at least one corresponding instance image; vii) map each of the one or more edges and the one or more corners to one or more sets of coordinates on a reference plane based on a reference plane calibration; viii) determine one or more measured characteristics of each workpiece instance based on the one or more sets of coordinates; and ix) verify, for each workpiece instance, the one or more measured characteristics of the workpiece instance with one or more expected characteristics associated with the predefined location, producing the verification indication; wherein: i) the plurality of workpieces is located in a predefined area at a physical location; ii) the plurality of workpieces comprises workpieces of differing dimensions; and iii) at least some of the plurality of workpieces are in a vertical orientation.

21. The system of claim 20, wherein the plurality of workpieces is arranged on a support structure.

22. The system of claim 21, wherein the support structure comprises a plurality of referencing members for aligning the plurality of workpieces with respect to one another.

23. The system of claim 21 or 22 wherein a subset of referencing members from the plurality of referencing members cooperate to define at least one channel for receiving at least one workpiece of the plurality of workpieces.

24. The system of any one of claims 21 to 23, wherein the referencing members comprise at least one of: a pin protruding out of a surface of the support structure; and a plate arranged orthogonally to the surface of the support structure.

25. The system of any one of claims 21 to 24, wherein the support structure comprises a wall, a shelf, or a material cart.

26. The system of any one of claims 20 to 25, wherein the workpiece plane calibration comprises a calibration board placed on the workpiece plane, the calibration board comprising a pattern readable by the processor to determine at least one of a relative position and a relative orientation.

27. The system of any one of claims 20 to 26, wherein the context input comprises at least one of a positive prompt and a negative prompt, wherein the positive prompt corresponds to a region where workpieces are likely to be present and the negative prompt corresponds to a region where workpieces are likely to be absent.

28. The system of any one of claims 20 to 27, wherein the context input is predetermined for each predefined area based on characteristics of the imaging device.

29. The system of any one of claims 20 to 28, wherein the one or more measured characteristics comprises one or more of a length, a width, a height, a position, and a material type.

30. The system of any one of claims 20 to 29, wherein the identified workpiece instances comprise a lumber instance.

31. The system of any one of claims 20 to 30, wherein the one or more imaging devices comprises at least one monocular camera.

32. The system of any one of claims 20 to 31, wherein the one or more imaging devices comprises at least one stereoscopic camera.

33. The system of any one of claims 20 to 31, wherein the one or more imaging devices comprises at least one monocular camera and at least one stereoscopic camera.

34. The system of any one of claims 20 to 33, wherein the instance segmentation model is fine-tuned on one or more training materials, the one or more training materials associated with a material type of at least one workpiece of the plurality of workpieces.

35. The system of any one of claims 20 to 34, wherein: the support structure is colored with one or more pre-selected colors; and the segmenting the at least one input image is further based on the one or more pre-selected colors.

36. The system of any one of claims 20 to 35, wherein the processor is further configured to pre-process the at least one input image by adjusting at least one brightness characteristic and at least one darkness characteristic of the at least one input image.

37. The system of any one of claims 20 to 36, wherein the semantic segmentation model comprises a convolutional neural network.

38. The system of any one of claims 20 to 37, wherein the instance segmentation model comprises a vision transformer.

39. A method for automated verification of a multi-ply sub-assembly, the multi-ply sub-assembly being located in a predefined area housing a plurality of channels at a physical location and comprising multiple workpieces, comprising: acquiring, at one or more imaging devices in communication with a processor, one or more input images of the multi-ply sub-assembly at the predefined area housing the plurality of channels; classifying, at the processor, for at least one input image of the one or more input images, each of the plurality of channels as occupied or unoccupied based on a classification model, to identify a plurality of occupied channels; generating, at the processor, for each occupied channel in the plurality of occupied channels, a multi-point prompt comprising at least one of a positive prompt and a negative prompt; segmenting, at the processor, the at least one input image of the one or more input images with a segmentation model based on the multi-point prompt to obtain one or more segmentation masks of the multi-ply sub-assembly within each occupied channel; detecting, at the processor, candidate corner points within each segmentation mask with a corner detection algorithm; selecting, at the processor, corners from among the candidate corner points to identify a plurality of selected corners based on a line detection algorithm; identifying, at the processor, a type of the multi-ply sub-assembly based on the selected corners; mapping, at the processor, the selected corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration to obtain mapped corner coordinates; computing, at the processor, one or more measured characteristics of the multi-ply sub-assembly based on the mapped corner coordinates; verifying, at the processor, the one or more measured characteristics of the multi-ply sub-assembly with one or more expected characteristics associated with the predefined location, producing a verification indication; and outputting, at a user interface in communication with the processor, the verification indication on a display of the user interface.

40. The method of claim 39, wherein the classification model is a support vector classification model.

41. The method of claim 39 or 40, wherein the corner detection algorithm is a Harris or Shi-Tomasi corner detection algorithm.

42. The method of any one of claims 39 to 41, wherein the line detection algorithm is a Hough transform algorithm.
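
The selection of corners from candidate corner points using detected lines can be sketched in Python as follows. This is one plausible reading, not the claimed algorithm. It assumes lines are given in Hough normal form `(rho, theta)`, so that a point `(x, y)` lies on the line when `x*cos(theta) + y*sin(theta) = rho`, and it keeps a candidate only if it lies close to at least two lines that are clearly not parallel. The function names, the 2-pixel tolerance and the 30-degree minimum angle are hypothetical.

```python
from math import cos, pi, sin


def distance_to_line(pt, line):
    """Perpendicular distance from pt = (x, y) to a line in (rho, theta) form."""
    rho, theta = line
    return abs(pt[0] * cos(theta) + pt[1] * sin(theta) - rho)


def angle_between(theta_a, theta_b):
    """Smallest angle between two line directions, in the range [0, pi/2]."""
    d = abs(theta_a - theta_b) % pi
    return min(d, pi - d)


def select_corners(candidates, lines, tol=2.0, min_angle=pi / 6):
    """Keep candidates that sit near the crossing of two non-parallel lines."""
    selected = []
    for pt in candidates:
        near = [theta for rho, theta in lines
                if distance_to_line(pt, (rho, theta)) <= tol]
        if any(angle_between(a, b) >= min_angle
               for i, a in enumerate(near) for b in near[i + 1:]):
            selected.append(pt)
    return selected


# Hypothetical input: a vertical line x = 10 and a horizontal line y = 20.
lines = [(10.0, 0.0), (20.0, pi / 2)]
candidates = [(10, 20), (10, 50), (40, 40)]
print(select_corners(candidates, lines))
```

Candidate points would typically come from a Harris or Shi-Tomasi detector and the lines from a Hough transform, for example via an image-processing library; those steps are omitted here to keep the sketch dependency-free.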

43. The method of any one of claims 39 to 42, further comprising identifying and verifying an orientation of the multi-ply assembly based on the plurality of selected corners.

44. The method of any one of claims 39 to 43, further comprising determining a number of plies in the multi-ply sub-assembly based on a count of the plurality of occupied channels.

45. The method of any one of claims 39 to 44, wherein the type of the multi-ply sub-assembly is selected from a group comprising: header-sill, king-jack stud, posts, and beam-pocket.

46. A system for automated verification of a multi-ply sub-assembly, comprising: one or more imaging devices in communication with a processor, configured to acquire one or more input images of the multi-ply sub-assembly at a predefined area housing a plurality of channels and transmit the one or more input images to the processor; a user interface in communication with the processor, configured to receive a verification indication from the processor and output the verification indication on a display of the user interface; and the processor, configured to: i) receive the one or more input images from the one or more imaging devices; ii) classify, for at least one input image of the one or more input images, each of the plurality of channels as occupied or unoccupied based on a classification model, to identify a plurality of occupied channels; iii) generate, for each occupied channel in the plurality of occupied channels, a multi-point prompt comprising at least one of a positive prompt and a negative prompt; iv) segment the at least one input image of the one or more input images with a segmentation model based on the multi-point prompt to obtain one or more segmentation masks of the multi-ply sub-assembly within each occupied channel; v) detect candidate corner points within each segmentation mask with a corner detection algorithm; vi) select corners from among the candidate corner points to identify a plurality of selected corners based on a line detection algorithm; vii) identify a type of the multi-ply sub-assembly based on the selected corners; viii) map the selected corners to one or more sets of coordinates on a workpiece plane based on a workpiece plane calibration to obtain mapped corner coordinates; ix) compute one or more measured characteristics of the multi-ply sub-assembly based on the mapped corner coordinates; and x) verify the one or more measured characteristics of the multi-ply sub-assembly with one or more expected characteristics associated with the predefined location, producing a verification indication.

47. The system of claim 46, wherein the classification model is a support vector classification model.

48. The system of claim 46 or 47, wherein the corner detection algorithm is a Harris or Shi-Tomasi corner detection algorithm.

49. The system of any one of claims 46 to 48, wherein the line detection algorithm is a Hough transform algorithm.

50. The system of any one of claims 46 to 49, wherein the type of the multi-ply sub-assembly is selected from a group comprising: header-sill, king-jack stud, posts, and beam-pocket.

51. The system of any one of claims 46 to 50, wherein the processor is further configured to identify and verify an orientation of the multi-ply assembly based on the plurality of selected corners.

52. The system of any one of claims 46 to 51, further comprising determining a number of plies in the multi-ply sub-assembly based on a count of the plurality of occupied channels.