Grounded observer model system and method

WO2026136889A1 · PCT designated stage · Publication Date: 2026-06-25 · YALE UNIVERSITY

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: YALE UNIVERSITY
Filing Date: 2025-12-19
Publication Date: 2026-06-25

Application Information

IPC: G06F40/40; G06F40/35; G06N3/09; G10L15/183; G10L15/22

AI Tagging

Application Domain

Natural language translation; Biological models

Technology Topics

Dialog system; Sentiment score

AI Technical Summary

Technical Problem

Foundation models lack robust guardrails to ensure socially acceptable and contextually appropriate behavior, particularly in sensitive domains like healthcare and finance, where traditional techniques for constraining agent behavior, such as prompt engineering and reinforcement learning, fail to provide reliable and adaptable constraints.

Method used

A grounded observer framework that uses a base large language model with an observer model to evaluate responses through sentiment, coherence, and formality analysis, providing feedback to refine responses and ensure they meet predefined thresholds, and a buffer system to enforce these constraints dynamically.

Benefits of technology

The framework enhances the reliability and adaptability of foundation models by ensuring responses are contextually appropriate, reducing the risk of missteps and improving user interactions in sensitive domains.

✦ Generated by Eureka AI based on patent content.

Abstract

A computer-implemented conversation system includes a base large language model configured to receive a prompt from a user and provide a base response, an observer large language model configured to evaluate the base response and provide a feedback prompt to the base large language model to produce a refined response, a sentiment analysis module configured to calculate a sentiment score of the refined response, a coherence analysis module configured to calculate a coherence score of the refined response, a formality analysis module configured to calculate a formality score of the refined response, and a controller configured to present the refined response to the user when the sentiment score is above a sentiment threshold, when the coherence score is above a coherence threshold, and when a formality score is below a formality threshold. A conversation system and a grounded observer system are also disclosed.

Description

Attorney docket # 047162-5394-00WO

GROUNDED OBSERVER MODEL SYSTEM AND METHOD

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. Provisional Application No. 63/737,306 filed on December 20, 2024, incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

[0002] Foundation models are rapidly being integrated into various fields, from medical diagnostics and financial predictions to socially sensitive areas such as education, mental healthcare, and support for individuals with disabilities.

[0003] As foundation models increasingly permeate sensitive domains such as healthcare, finance, and mental health, ensuring their behavior meets desired outcomes and social expectations becomes critical. In fields where accuracy and reliability are paramount, such as healthcare and finance, the consequences of errors of foundation models can be severe. Yet, in socially sensitive domains, where the parameters of success are less tangible, the impact of missteps can be as profound. For example, a system intended to provide calming techniques in a clinic waiting room could exacerbate anxiety if it delivers generic or poorly timed suggestions. If it fails to recognize the urgency or context of a patient’s distress, it may offer advice that feels dismissive or irrelevant, potentially increasing the patient’s anxiety. Traditional techniques for constraining agent behavior, which typically rely on low-dimensional, discrete state and action spaces, cannot be directly applied to these high-dimensional models.

[0004] There remains a need in the art for robust guardrails for foundation models to protect users and the system’s integrity.

SUMMARY OF THE INVENTION

[0005] In one aspect, a computer-implemented conversation system comprises a base large language model configured to receive a prompt from a user and provide a base response, an observer large language model configured to evaluate the base response and provide a feedback prompt to the base large language model to produce a refined response, a sentiment analysis module configured to calculate a sentiment score of the refined response, a coherence analysis module configured to calculate a coherence score of the refined response, a formality analysis module configured to calculate a formality score of the refined response, and a controller configured to present the refined response to the user when the sentiment score is above a sentiment threshold, when the coherence score is above a coherence threshold, and when a formality score is below a formality threshold.
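The gating condition applied by the controller can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Scores` and `Thresholds` names and the default threshold values are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    sentiment: float
    coherence: float
    formality: float


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical defaults chosen only for illustration.
    sentiment: float = 0.0
    coherence: float = 0.5
    formality: float = 0.5


def should_present(scores: Scores, limits: Thresholds) -> bool:
    """Approve a refined response only when sentiment and coherence are
    above their thresholds and formality is below its threshold."""
    return (
        scores.sentiment > limits.sentiment
        and scores.coherence > limits.coherence
        and scores.formality < limits.formality
    )
```

A response that is positive and coherent but too formal would be held back, for example `should_present(Scores(0.4, 0.8, 0.9), Thresholds())` evaluates to `False`.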

[0006] In some embodiments, the observer large language model is configured to provide the feedback prompt to summarize the base response when the base response exceeds a response length threshold. In some embodiments, the response length threshold is between 20 and 150 tokens. In some embodiments, the sentiment analysis module performs a VADER sentiment analysis. In some embodiments, the sentiment analysis score is between -1 and 1. In some embodiments, the coherence analysis module is configured to perform a BERT coherence analysis. In some embodiments, the coherence analysis module is configured to encode the refined response to a sequence of tokens, derive a set of embeddings of the sequence of tokens, and calculate an entropy of the token embeddings.
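One possible reading of the entropy step in the coherence analysis is sketched below. The disclosure does not state how entropy is derived from the embeddings, so this example assumes each token embedding is turned into a probability distribution with a softmax and that the per-token Shannon entropies are averaged. The embeddings are plain lists of floats standing in for the output of a BERT-style encoder; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one embedding vector."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def embedding_entropy(embeddings: Sequence[Sequence[float]]) -> float:
    """Mean Shannon entropy (in nats) of softmax-normalized token embeddings.

    Assumption: the softmax normalization and the averaging are illustrative
    choices; the source only says that an entropy of the token embeddings
    is calculated.
    """
    if not embeddings:
        raise ValueError("need at least one token embedding")
    entropies = []
    for vector in embeddings:
        probs = softmax(vector)
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0.0))
    return sum(entropies) / len(entropies)
```

Under this reading, a flat embedding vector yields the maximum entropy for its dimension, while a sharply peaked vector yields a value near zero. How the entropy maps onto a coherence score is left open here.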

[0007] In some embodiments, the formality analysis module is configured to calculate a cosine similarity of the refined response to a set of assistance keywords. In some embodiments, the set of assistance keywords comprises at least one of “help,” “assist,” “information,” “support,” “guide,” “instruction,” “advice,” “solve,” “fix,” “clarify,” “resolve,” “solution,” “answer,” and “explain.”
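The cosine-similarity comparison against the assistance keywords can be illustrated as follows. This is a minimal sketch that assumes simple term-frequency vectors; the disclosure does not specify the vector representation, and an embedding-based comparison would be equally consistent with the text. The function name is hypothetical.

```python
import math
import re
from collections import Counter
from typing import Iterable

ASSISTANCE_KEYWORDS = (
    "help", "assist", "information", "support", "guide", "instruction",
    "advice", "solve", "fix", "clarify", "resolve", "solution", "answer",
    "explain",
)


def keyword_cosine_similarity(
    response: str, keywords: Iterable[str] = ASSISTANCE_KEYWORDS
) -> float:
    """Cosine similarity between word counts of the response and the keywords.

    Assumption: bag-of-words counts stand in for whatever vectorization
    the formality analysis module actually uses.
    """
    response_counts = Counter(re.findall(r"[a-z']+", response.lower()))
    keyword_counts = Counter(keywords)
    dot = sum(response_counts[word] * count for word, count in keyword_counts.items())
    norm_response = math.sqrt(sum(c * c for c in response_counts.values()))
    norm_keywords = math.sqrt(sum(c * c for c in keyword_counts.values()))
    if norm_response == 0.0 or norm_keywords == 0.0:
        return 0.0
    return dot / (norm_response * norm_keywords)
```

With this representation, a response such as "I can help explain the solution" scores higher than "nice weather today", which shares no words with the keyword set and scores zero.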

[0008] In one aspect, a conversation system comprises a robot comprising an audio input, an audio output, and a processor communicatively connected to the audio input and the audio output, a non-transitory computer-readable medium communicatively connected to the processor with instructions stored thereon, which when executed by the processor, perform steps comprising recording an audio stream from the audio input, transcribing the audio stream to a text prompt, providing the text prompt to a base large language model to produce a base response, evaluating the base response with an observer large language model to provide a feedback prompt to the base large language model and produce a refined response, calculating a sentiment score of the refined response, calculating a coherence score of the refined response, calculating a formality score of the refined response, approving the refined response when the sentiment score is above a sentiment threshold, when the coherence score is above a coherence threshold, and when a formality score is below a formality threshold, synthesizing an audio response from the approved refined response, and transmitting the audio response via the audio output.

[0009] In some embodiments, the robot further comprises an image acquisition device, and wherein the instructions further comprise the step of calculating person detection and gaze detection on a user of the conversation system. In some embodiments, the step of evaluating the base response with the observer large language model comprises providing the feedback prompt to summarize the response to the base large language model when a length of the base response exceeds a threshold. In some embodiments, the system further comprises the step of providing a second feedback prompt to the base large language model to restate the base response more positively when the sentiment score is below the sentiment threshold.

[0010] In some embodiments, the instructions further comprise the step of providing a second feedback prompt to the base large language model to restate the base response more casually when the formality score of the base response exceeds the formality threshold. In some embodiments, the instructions further comprise the step of providing a second feedback prompt to the base large language model to restate the base response more clearly when the coherence score of the base response is below the coherence threshold. In some embodiments, the base large language model and the observer large language model use a same architecture.
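The feedback prompts described in the preceding paragraphs (summarize, restate more positively, restate more casually, restate more clearly) can be sketched as a simple selection function. The prompt wording, the default thresholds, and the whitespace-based token count are assumptions made for illustration only.

```python
from typing import List


def feedback_prompts(
    response: str,
    sentiment: float,
    coherence: float,
    formality: float,
    *,
    sentiment_min: float = 0.0,   # hypothetical threshold
    coherence_min: float = 0.5,   # hypothetical threshold
    formality_max: float = 0.5,   # hypothetical threshold
    max_tokens: int = 150,        # within the 20 to 150 token range in the text
) -> List[str]:
    """Return the feedback prompts triggered by a response and its scores."""
    prompts = []
    # Assumption: whitespace-separated words approximate the token count.
    if len(response.split()) > max_tokens:
        prompts.append("Summarize the previous response.")
    if sentiment < sentiment_min:
        prompts.append("Restate the previous response more positively.")
    if formality > formality_max:
        prompts.append("Restate the previous response more casually.")
    if coherence < coherence_min:
        prompts.append("Restate the previous response more clearly.")
    return prompts
```

An empty list means no refinement is requested; otherwise each returned prompt would be sent to the base large language model in turn.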

[0011] In one aspect, a grounded observer system comprises a base machine learning model configured to receive a base prompt and produce a base response, a feature extractor configured to extract contextual features from the base response, an overlay rules module having a set of overlay rules to apply to the contextual features, each overlay rule having a rigidity measure, the overlay rules module configured to produce a set of overlay descriptors, and a buffer comprising an observer machine learning model, configured to receive the overlay descriptors and the base response, and provide a feedback directive to the LLM based on the base response and the overlay descriptors, wherein the feedback directive is configured to alter the base response to better conform to at least one of the overlay rules.

[0012] In some embodiments, when the rigidity measure is strict, the corresponding overlay rule is strictly applied, and wherein when the rigidity measure is lax, the corresponding overlay rule is flexibly applied. In some embodiments, the observer machine learning model and the base machine learning model are large language models. In some embodiments, at least one of the overlay rules comprises an “if this, then that” rule.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] The foregoing purposes and features, as well as other purposes and features, will become apparent with reference to the description and accompanying figures below, which are included to provide an understanding of the invention and constitute a part of the specification, in which like numerals represent like elements.

[0014] FIG. 1A depicts a schematic of an exemplary system comprising a grounded observer monitoring a base model’s behavior to ensure responses adhere to overlay constraints according to aspects of the present disclosure.

[0015] FIG. 1B depicts a schematic of an exemplary computer-implemented conversation system according to aspects of the present disclosure.

[0016] FIG. 1C depicts a schematic of an exemplary computer-implemented grounded observer system according to aspects of the present disclosure.

[0017] FIG. 1D depicts a schematic of an exemplary user interface for the disclosed system according to aspects of the present disclosure.

[0018] FIG. 2 depicts the evaluation scores of Large Language Models (LLMs). The depicted graph reflects the similarity of the model’s small talk to that of the participants, scored from 0 (no differences between human and model responses) to 4 (highest absolute difference).

[0019] FIG. 3 depicts the evaluation of observer v. base responses. The depicted graph reflects the similarity of the models’ small talk to that of its human users during text-based, chatbot interaction. Scores range from 0 (no difference) to 4 (highest absolute difference).

[0020] FIG. 4 depicts the observer-enabled robot engaged in naturalistic small talk with users, which fostered rapport, enhanced user comfort, and created more seamless interactions.

[0021] FIG. 5 depicts an exemplary computing environment in accordance with some of the embodiments.

DETAILED DESCRIPTION

[0022] It is to be understood that the figures and descriptions of the present invention have been simplified to illustrate elements that are relevant for a clear understanding of the present invention, while eliminating, for the purpose of clarity, many other elements found in related systems and methods. Those of ordinary skill in the art may recognize that other elements and/or steps are desirable and/or required in implementing the present invention. However, because such elements and steps are well known in the art, and because they do not facilitate a better understanding of the present invention, a discussion of such elements and steps is not provided herein. The disclosure herein is directed to all such variations and modifications to such elements and methods known to those skilled in the art.

[0023] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, exemplary methods and materials are described.

[0024] As used herein, each of the following terms has the meaning associated with it in this section.

[0025] The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

[0026] “About” as used herein when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20%, ±10%, ±5%, ±1%, and ±0.1% from the specified value, as such variations are appropriate.

[0027] Throughout this disclosure, various aspects of the invention can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 2.7, 3, 4, 5, 5.3, 6 and any whole and partial increments therebetween. This applies regardless of the breadth of the range.

Related Work

[0028] Given their complexity and the vast datasets they are trained on, ensuring that base models behave in predictable and socially acceptable ways is a significant challenge. Researchers have explored approaches to impose constraints on large language models (LLMs), each with strengths and limitations.

Prompt Engineering

[0029] The current standard for constraining model behavior is having a good prompt. While crafting specific input prompts has shown promise in many applications (Giray 2023; Mesko 2023; White et al. 2023), it has significant limitations when it comes to robustly constraining agent behaviors, especially in complex, dynamic, and sensitive contexts.

[0030] Lack of Robustness. One of the primary limitations of prompt engineering is its lack of robustness. While specific prompts can guide the model in controlled scenarios, they often fail to generalize across different contexts and variations. A prompt or modification to a prompt that works well in one situation might produce unexpected or undesirable results in another, leading to inconsistent behavior (Zhou et al. 2022; Huang et al. 2024).

[0031] Context Sensitivity. Foundation models are highly sensitive to the context provided by prompts. Small changes in phrasing can lead to significantly different outputs, making it challenging to predict and control the model’s behavior reliably (Denny, Kumar, and Giacaman 2023; Dong et al. 2024). This sensitivity can be particularly problematic in dynamic environments where the context is continuously changing.

[0032] Inability to Enforce Hard Constraints. Prompt engineering cannot enforce hard constraints on model behavior. While prompts can suggest or guide the model toward certain behaviors, they cannot guarantee that it will always comply with these suggestions (Niknazar et al. 2024). This limitation is critical in applications where strict adherence to ethical guidelines or safety protocols is necessary.

[0033] Translating to Real-World Behavior. Many real-world scenarios involve ambiguous and complex situations that are difficult to capture with prompts (Leite, Martinho, and Paiva 2013). For example, ensuring that an LLM provides appropriate mental health support requires understanding and responding to nuanced emotional cues, which cannot be fully encapsulated in a prompt. In such cases, prompt engineering alone cannot ensure reliable and sensitive behavior.

[0034] Temporal Constraints. Prompt engineering does not inherently support temporal constraints, where the desired behavior depends on the sequence and timing of interactions (Lyu et al. 2024; Chen and Huang 2023). For example, maintaining consistent behavior over multiple exchanges with a user is challenging to achieve through prompt design alone.

Constrained Reinforcement Learning

[0035] Constrained reinforcement learning (CRL) enhances traditional reinforcement learning (RL) by integrating predefined constraints to ensure agents operate within specific safety, ethical, or operational boundaries. While traditional RL focuses solely on maximizing cumulative rewards, CRL incorporates additional constraints as hard limits (e.g., avoiding unsafe actions) or soft constraints (e.g., minimizing deviation from desired behaviors). CRL incorporates inductive biases through logical rules that govern the agent’s behavior, applying these constraints directly to states and actions or modifying the reward function to align with the defined limits (Gu et al. 2022).

[0036] A notable approach within CRL is shielded RL, which employs user-defined policy overrides, or “shields,” to restrict certain actions based on specific conditions, thereby minimally disrupting the RL model while enforcing desired behaviors (Garcia and Fernandez 2015). However, shielded RL typically relies on a dynamic model and repairing existing policies rather than adapting to evolving preferences. In contexts such as personalized healthcare or companionship, a flexible approach to adapt policies to meet context-specific needs in real-time is more suitable.

Transparent Matrix Overlays

[0037] Transparent Matrix Overlays (TMOs) are a promising technique for real-time modification of agent behavior by integrating user directives as symbolic constraints on a robot’s policy (Brawer et al. 2023). This approach merges concepts from CRL and shielded RL, leveraging symbolic reasoning to enhance flexibility in behavioral adaptation.

[0038] Demonstrated through a simulated collaborative cooking task (Brawer et al. 2023), TMOs allowed adjustments to a robot’s policy without requiring extensive retraining. By applying logical rules and user-specific directives as temporary constraints, TMOs facilitate immediate changes in behavior to align with evolving user preferences. This method contrasts with traditional CRL techniques, which often require substantial retraining to incorporate new constraints, and shielded RL methods that focus on policy repairs rather than accommodating real-time preference changes. TMOs balance the stability of learned behaviors with the flexibility required to meet new and evolving preferences, making them a valuable tool for interactive systems.

[0039] One limitation of TMOs is the reliance on hand-crafted predicates and classifiers. In the current implementation, these elements are manually designed to define constraints and directives. While this method works within controlled environments, it constrains the flexibility of the TMO approach. The assumptions of having a relatively simple, discrete state space, deterministic actions, and non-parallel task completion further simplify the scenario. Real-world applications often involve more dynamic and complex environments where these assumptions may not hold.

State and Action Space Abstraction

[0040] Most action selection mechanisms, like TMOs, assume a known, discrete, or discretized state space with well-defined actions. However, in interactions with LLMs, an action selection mechanism must handle continuous and possibly infinite state spaces where iterating through all possible actions or states may be impractical (Paul 2024). This requires rules that can overlay abstracted state representations or symbolic predicates to approximate the agent’s internal state and action space. Instead of exhaustively evaluating every action, the agent can use these overlays to focus on a manageable subset of candidate actions or employ probabilistic sampling techniques within the space emphasized by the overlays. Furthermore, such abstraction must supersede differences in how proprietary architectures handle context, manage memory, and generate responses (Naveed et al. 2023).

A Grounded Observer Framework

[0041] Social behavior is inherently emergent and complex. However, in many cases, appropriate behavior can be guided by simple rules. Just as TMOs embed rules to control behavior, similar principles can be applied to ensure that base models exhibit appropriate social behavior. Base models are analogous to the action policies generated — they are statistical models that are expensive to generate, difficult to dissect, and opaque to inspection. By imposing transparent and adaptive constraints, these models can be managed and directed to align with desired outcomes in socially sensitive domains. This can be achieved by evaluating a model’s output through context-based rules and providing feedback to guide the model toward more appropriate behaviors.

[0042] Aspects of the invention relate to a software-implemented conversation system, specifically an LLM or chatbot software configured for informal conversation about unimportant or uncontroversial matters, sometimes referred to as “small talk.” The conversation system and methods thereof may in some embodiments be referred to herein as a “grounded observer framework.”

[0043] Given the complexities of high-dimensional models, traditional techniques for constraining agent behavior, which typically rely on low-dimensional, discrete state and action spaces, cannot be directly applied. Drawing inspiration from robotic action selection techniques, the disclosed grounded observer framework constrains foundation model behavior and offers both behavioral guarantees and real-time variability. This method leverages real-time assessment of low-level behavioral characteristics to dynamically adjust model actions and provide contextual feedback. In some embodiments, the disclosed system may be applied to a robot for novel, unscripted audio, visual, and/or text-based interactions with humans.

[0044] Despite being aware of the inherent risks of AI hallucinations, misinformation, and bias, a recent large-scale global study revealed that 66% of respondents are still willing to use this nascent technology in sensitive areas such as personal advice and relationship counseling (Capgemini 2023). This paradox highlights the immense potential benefits of these models in addressing societal challenges while also underscoring the current concerns. A significant issue tempers the widespread adoption of these tools: the lack of comprehensive guardrails to prevent undesired behavior and ensure reliable outcomes.

[0045] Designing usable systems that impose limits on foundation models involves two key challenges. First, foundation models are based on statistical learning from vast datasets, making their internal mechanisms complex and opaque. Traditional rule-based systems use symbolic representations, which are formal and interpretable but not directly compatible with the statistical nature of foundation models. This difficulty is compounded when integrating symbolic rule-based systems that map human concepts into precise rules, a challenge akin to reconciling statistical learning mechanisms with symbolic representation systems. While neurosymbolic approaches that aim to blend statistical and symbolic methods are being explored (e.g., Garcez, A. d. and Lamb, L. C. 2023), effective integration remains an open area of research.

[0046] Second, foundation models must be able to adapt their behavior in real-time to the unique needs and contexts of individual users (Wang et al. 2023; Chen et al. 2024). Static, predefined rules often do not address the dynamic and nuanced nature of personal interactions (Raman et al. 2022). For instance, an LLM for mental health support must respond appropriately to a user’s current emotional state and context. A static rule-based approach may fail to provide suitable support during a crisis or tailor interactions based on ongoing conversations, highlighting the need for real-time adaptability to meet individual user needs.

[0047] These two challenges are not unique to foundation models but manifest in other areas, such as robotics. In action selection for robot systems, an agent must decide which actions to take, often using large-scale statistical models, while adhering to user-specified rules, such as “don’t touch the stove.” Addressing this involves techniques known as shielding (Alshiekh et al. 2018) and interactive policy shaping (Griffith et al. 2013). Shielding techniques prevent particular actions from being executed, effectively restricting the robot’s behavior, while interactive policy shaping modifies the action selection policy in real time based on user input or situational changes. These approaches aim to reconcile the flexibility of statistical models with the necessity of adhering to predefined constraints (Biza et al. 2021), reflecting similar challenges faced in the context of foundation models.

[0048] Drawing inspiration from robotic action selection techniques, some embodiments of the disclosed system comprise a framework for constraining foundation model behavior that offers both behavioral guarantees and real-time variability. In some embodiments, the system comprises a grounded observer model that continuously assesses candidate actions of a base model based on low-level behavioral characteristics, makes dynamic adjustments to the base model action generation, and provides feedback directives or prompts to ensure the behavior of the base model remains contextually appropriate and effective.

[0049] In one example, the disclosed system comprises a framework of a grounded observer, applied to build agents capable of small talk, a task that requires nuanced social sensitivity to ensure continued appropriateness and relevance. The disclosed system demonstrates how the grounded observer can impose precise constraints on LLM behavior in highly subjective contexts and challenge the typically informative and assistive nature of these models. It was also demonstrated that this method leads to more positive and socially appropriate interactions when integrated into a robot where it amplifies social impacts. Although in certain embodiments, the disclosed system is used to improve the quality of small talk in interactions with an LLM or chatbot, it is understood that the system and methods disclosed herein can also be used to create guidelines in various other socially sensitive domains, including but not limited to academic tutoring agents, where the observer may be configured to ensure constructive and contextually appropriate feedback to students; therapeutic AI systems, where the observer may be configured to mitigate risks of generating insensitive or harmful responses during mental health support; AI companions for older adults, where the observer may be configured to ensure emotionally aware and comforting interactions; customer support systems, where the observer may be configured to maintain a professional tone and customer satisfaction; and automated moderation systems for online platforms, where the observer may be configured to enforce clear, empathetic, and culturally sensitive communication when addressing rule violations or disputes.

[0050] With reference to FIG. 1A, an overview of an exemplary system 100 (e.g., conversation system) disclosed herein is shown. The base model 101 is provided with a base prompt 102 and generates actions in response to environmental or user inputs (e.g., sensor data, user prompts). Depending on the type of model, these actions can take the form of text, images, videos, audio streams, structured data such as JSON or XML, programmatic code, robotic control signals, 3D object representations, augmented reality overlays, or multimodal outputs combining different formats (e.g., text and visual feedback), or any other suitable output or combination of outputs. In some embodiments, the base model is an LLM, and both the model’s inputs and outputs are in text form, though other modalities are also applicable. To evaluate the base model’s actions, feature extractors 103 convert these actions and the surrounding context into contextual or numerical features 106. These features 106 can then be analyzed as scores based on the characteristics for which evaluation is desired.

[0051] Depending on the scenario, these extractors 103 may also incorporate inputs from high-level planners or context observers. For example, a feature extractor 103 could be designed to quantify the politeness of the model’s 101 text output.

[0052] These contextual features are evaluated against IFTTT (If This, Then That) rules 105, which function as overlays on the model’s 101 actions. The overlay rules 105 may be thought of as semi-transparent sheets on an overhead projector: one can stack, prioritize, or remove them to adjust the view without altering the original image, which in this context is the base response of the base model 101. Similarly, these rules 105 can be adjusted without extensive changes to the base model 101.

[0053] High-level descriptors — summaries of how well proposed actions align with the overlays — may be provided by each overlay rule for example in a fixed text structure. These descriptors pinpoint areas where proposed actions comply with or deviate from the established rules. For instance, a rule about politeness might provide a directive like “tone is too polite,” while a rule that assesses user frustration could direct the model to include more empathetic language. Each overlay may produce only a binary yes / no response, or may additionally or alternatively produce a score indicating the degree of deviation from the rule 105. These scores may highlight more severe rule violations by using methods such as ranking or incorporating keywords like “prioritize” or “urgent” in the directives.
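The evaluation of features against overlay rules can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `OverlayRule`, `Descriptor`, and `apply_overlays`, the feature keys, and the deviation formulas are all hypothetical. Only the 50-token default and the -0.5 sentiment bound are taken from the thresholds described elsewhere in this disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class OverlayRule:
    name: str
    feature: str                         # key into the extracted feature dict
    violated: Callable[[float], bool]    # "if this": True when the rule is broken
    deviation: Callable[[float], float]  # degree of deviation when broken
    directive: str                       # "then that": fixed-text descriptor


@dataclass
class Descriptor:
    rule: str
    violated: bool
    score: float
    directive: str


def apply_overlays(features: Dict[str, float], rules: List[OverlayRule]) -> List[Descriptor]:
    """Evaluate every rule and return descriptors, most severe violation first."""
    descriptors = []
    for rule in rules:
        value = features[rule.feature]
        broken = rule.violated(value)
        score = rule.deviation(value) if broken else 0.0
        text = rule.directive if broken else "ok"
        descriptors.append(Descriptor(rule.name, broken, score, text))
    return sorted(descriptors, key=lambda d: d.score, reverse=True)


# Hypothetical rules; the deviation formulas are illustrative assumptions.
rules = [
    OverlayRule("brevity", "tokens", lambda v: v > 50,
                lambda v: (v - 50) / 50, "response is too long"),
    OverlayRule("tone", "sentiment", lambda v: v < -0.5,
                lambda v: -0.5 - v, "tone is too negative"),
]
descriptors = apply_overlays({"tokens": 80, "sentiment": 0.2}, rules)
```

Ranking the descriptors by score is one way to surface the more severe violations first, as the paragraph above suggests. Keyword-based emphasis such as “urgent” could be layered on the same structure.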

[0054] An observer model 104, which may in some embodiments be a separate base model instance, receives these directives, then combines and translates them into actionable feedback for the base model. For example, if a directive indicates that the tone is too polite, the feedback might be, “The previous response was overly formal. Please adopt a more casual tone.”

[0055] A buffer 108 may be configured to act as a gatekeeper, as shown in FIG. 1A, determining whether a proposed action should be accepted. Each overlay 105 can be assigned a rigidity parameter (depicted as c) that defines how strictly the model must adhere to the rule. Essentially, in reference to the overhead projector analogy, this parameter controls the translucency of an overlay sheet on the overhead projector. Instead of enforcing a strict binary compliance — where actions either fully meet the overlays or not — rigidity offers a gradient of compliance or a buffer around proposed actions.

[0056] For highly rigid overlays, compliance is strictly enforced. If an action or response deviates from the specified rules, the base model 101 is required to regenerate new candidate actions until the response complies with the rule. This requirement ensures that only actions meeting the strict criteria are considered. For instance, if an overlay rule demands that responses must be empathetic, any response lacking empathy would lead to the base model 101 generating alternative responses that conform to this requirement.

[0057] Less rigid overlays allow the “overlay sheet” to be more translucent so that actions that partially meet the criteria can still be considered. The observer model 104 may rank or prioritize these partially compliant actions, accepting them within a permissible range. For example, if an overlay requires responses to be empathetic, a response that shows limited empathy but is otherwise acceptable might still be chosen.

[0058] This flexibility helps manage the model’s load and processing time when correcting its actions. For non-critical conditions, low rigidity can be used, while critical conditions require higher rigidity. The buffer 108 can limit the number of action regeneration cycles to prevent excessive resource consumption while enforcing the necessary constraints, for example with a retry limit.

[0059] The observer utilizes the overlay descriptors and rigidity to create targeted feedback prompts 107 to the base model 101. Two types of feedback were incorporated:

[0060] Implicit feedback notes that the action is acceptable but offers constructive advice for improving subsequent actions. For example, if the actions are near compliance but not perfect, implicit feedback may recommend minor adjustments, such as modifying tone or phrasing. Suppose the base model 101 generates a response that is mostly empathetic but could be softer in tone. The implicit feedback might suggest: “Consider using a gentler tone in your responses.” This allows the base model 101 to refine its output in future iterations.

[0061] Forced feedback is employed when the base model’s actions significantly deviate from the overlay constraints. When the descriptors reveal substantial misalignment with the overlay rules, the observer generates a more directive prompt 107, instructing the base model 101 to focus on specific improvements until it fully complies with the constraints. The observer model 104 may issue several rounds of feedback 107 if needed until proposed actions meet the overlay requirements.

[0062] Overall, this feedback loop ensures that the base model 101 continually aligns with the overlays by translating its performance on specific rules into clear instructions.
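One way to picture how the rigidity parameter and the two feedback types could interact is the Python sketch below. It rests on assumptions that are not stated in the disclosure: rigidity c is normalized to the range 0 to 1, deviation is a non-negative score on the same scale, and the tolerated deviation is simply 1 - c. The names `Decision` and `buffer_decision` are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    IMPLICIT = "accept_with_implicit_feedback"
    FORCED = "reject_with_forced_feedback"


def buffer_decision(deviation: float, rigidity: float) -> Decision:
    """Map a rule deviation and a rigidity value to a buffer decision.

    Assumption: rigidity lies in [0, 1], and an overlay with rigidity c
    tolerates deviations up to 1 - c (a fully rigid overlay tolerates none).
    """
    if not 0.0 <= rigidity <= 1.0:
        raise ValueError("rigidity must be between 0 and 1")
    if deviation <= 0.0:
        return Decision.ACCEPT      # fully compliant action
    tolerance = 1.0 - rigidity
    if deviation <= tolerance:
        return Decision.IMPLICIT    # accepted, with advice for later turns
    return Decision.FORCED          # rejected, regeneration requested
```

Under these assumptions, a deviation of 0.2 is rejected by a fully rigid overlay (c = 1.0) but accepted with implicit feedback by a half-rigid one (c = 0.5).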

[0063] To design the overlay rules, specific features were extracted based on response criteria emphasized in the literature: brevity, tone, specificity, and coherence. The rigidity and thresholds for the overlays were estimated using the dataset collected from the baseline study. Below, the methods for calculating these features are described, followed by a description of the feedback prompts generated by the observer.

[0064] Brevity. Setting a limit on the length of the generated responses enhances the practicality and user-friendliness of the model, aligning with the natural flow of everyday conversations. To enforce this limit, in some embodiments, the observer module defines an expected number of completion tokens or other length constraints, such as character count or word count (OpenAI 2024). In some embodiments, the threshold for response length can range between 20 and 100 tokens for concise and contextually appropriate replies, for example with a default threshold set to 50 tokens. For more verbose tasks or scenarios requiring additional detail, the threshold may be extended to 150 tokens. Alternatively, the observer may apply adaptive length constraints based on conversational context or user preferences to ensure responses remain both relevant and appropriately concise.

[0065] Tone. The VADER model (Hutto and Gilbert 2014) for sentiment analysis was employed. The evaluation of tone and sentiment in a small talk response can be approached both per sentence and holistically. This dual approach provides a nuanced understanding of the response’s contribution to the conversational tone, addressing micro-level details and macro-level coherence. The relative weights of these scores were estimated using the baseline dataset, and a combined score (C) was calculated as follows: Equation 1

[0066] In this formula, H represents the overall score from VADER, and w_H is the weight assigned to this overall score. The variable n denotes the number of sentences, while S_i indicates the sentence-level score for the i-th sentence, with w_i being the weight assigned to that specific sentence. The score C ranges from -1 to +1. A value between -0.5 and 0 signifies a neutral response, and from 0 to 1 indicates positivity; both are acceptable for a small talk response.
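Equation 1 itself is not reproduced in this text. One form consistent with the definitions above, offered only as an assumption, is a weighted sum of the overall and sentence-level scores. The non-negativity and normalization constraints on the weights are likewise assumed; they are what would keep C within -1 to +1 when H and each sentence-level score already lie in that range.

```latex
C = w_H \, H + \sum_{i=1}^{n} w_i \, S_i,
\qquad w_H \ge 0,\quad w_i \ge 0,\quad w_H + \sum_{i=1}^{n} w_i = 1
```

As a worked instance under this assumed form: with H = 0.4, two sentences scoring 0.6 and -0.2, and weights w_H = 0.5, w_1 = w_2 = 0.25, the combined score is 0.5(0.4) + 0.25(0.6) + 0.25(-0.2) = 0.30, which falls in the acceptable range.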

[0067] Specificity. Response specificity was assessed through Natural Language Toolkit’s (NLTK) named entity chunker and part-of-speech tagging (Bird, Klein, and Loper 2009). Counts of entities and descriptive words were normalized based on maximum expected counts, derived from human responses in the baseline data. For example, in certain embodiments, the maximum thresholds for named entities (e.g., people, places, or organizations) can range between 3 and 7 entities per response, depending on the complexity of the context. Similarly, thresholds for descriptive words, such as adjectives or adverbs, may range between 10% and 30% of the total token count in a given response. For shorter responses (e.g., fewer than 50 tokens), entity counts might be capped at 2, with descriptive word proportions adjusted accordingly to maintain balance and relevance.

[0068] Coherence. To quantify coherence, each response was encoded into a sequence of tokens and derived embeddings using BERT (Devlin et al. 2018). The calculated entropy of token embeddings of a response captures the uncertainty and diversity at each conversational turn. Subsequently, information gain was gauged by considering the entropy of the previous response and the weighted average of the entropies in the current response.

[0069] Other Considerations. As noted in the baseline study, it is the nature of LLMs to offer assistance. Yet, offers of help may result in conversations that sound too technical or formal. To mitigate this, the observer calculates the cosine similarity of embeddings to keywords of assistance, such as “help,” “assist,” “information,” “support,” “guide,” “instruction,” “advice,” “solve,” “fix,” “clarify,” “resolve,” “solution,” “answer,” and “explain.” The list of specified keywords was determined using the collected dataset. In some embodiments, a cosine similarity threshold of 0.7 or higher is used to identify and flag responses as overly helpful or assistance-oriented. Responses exceeding this threshold can be modified, rejected, or redirected to align with the conversational context. In more casual or informal settings, a stricter threshold can be applied to further reduce formal or technical tones.
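The assistance-keyword check can be sketched in Python as follows. The three-dimensional vectors are toy placeholders standing in for real embeddings, which the disclosure derives from a language model. The function names and the choice to compare against the single best-matching keyword are assumptions; only the 0.7 threshold comes from the text.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_overly_helpful(response_vec: Sequence[float],
                      keyword_vecs: Dict[str, Sequence[float]],
                      threshold: float = 0.7) -> bool:
    """Flag a response whose best keyword similarity reaches the threshold."""
    best = max(cosine_similarity(response_vec, vec) for vec in keyword_vecs.values())
    return best >= threshold


# Toy embeddings (hypothetical values, for illustration only).
keyword_vecs = {"help": [1.0, 0.0, 0.0], "fix": [0.8, 0.6, 0.0]}
assistive = is_overly_helpful([0.9, 0.1, 0.0], keyword_vecs)
casual = is_overly_helpful([0.0, 0.0, 1.0], keyword_vecs)
```

Taking the maximum over keywords flags a response as soon as it is close to any one assistance term. Averaging over the keyword set would be a more lenient alternative.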

[0070] Feedback. Timely responses are crucial for maintaining conversational flow, which requires balancing the detail and frequency of model updates during execution. When the base model generates a response that violates an overlay rule, such as being excessively verbose, the permissible buffer allows a gradation of compliance. For minor deviations, the buffer will allow the observer to synthesize the overlay directives to curate implicit feedback such as, “Your response was too lengthy; aim for a more concise reply while still addressing the topic.” This flexibility can accommodate slight variations while encouraging improvements, rather than forcing computationally heavy, drastic changes.

[0071] In contrast, for significant deviations — such as off-topic or inappropriate content — the observer uses forced feedback: “Your response is off-topic and contains irrelevant content; provide a relevant and concise reply related to the current conversation. For example, [...].” Here, the permissible buffer rejects the action, and the base model is required to regenerate the response until it fulfills overlay rules. This approach ensures that the model adheres strictly in critical situations, while allowing for more flexibility in less severe deviations. To facilitate timely small talk, this forced feedback is used sparingly as determined by a random factor, with a maximum limit of three regeneration attempts.
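The bounded regeneration described above might look like the following Python sketch. The `generate` and `check` callables are hypothetical stand-ins for the base model and the overlay evaluation, and the random factor mentioned in the text is deliberately left out for clarity. Only the limit of three regeneration attempts is taken from the disclosure.

```python
from typing import Callable, Optional, Tuple

MAX_REGENERATIONS = 3  # limit stated in the text


def regenerate_until_compliant(
    generate: Callable[[str, Optional[str]], str],
    check: Callable[[str], Optional[str]],
    prompt: str,
    max_regenerations: int = MAX_REGENERATIONS,
) -> Tuple[str, int]:
    """Return the final response and the number of regenerations used.

    `check` returns None when the response complies, or a forced-feedback
    string otherwise. After the limit is reached, the last response is
    returned even if it still violates a rule.
    """
    response = generate(prompt, None)
    attempts = 0
    while True:
        feedback = check(response)
        if feedback is None or attempts >= max_regenerations:
            return response, attempts
        attempts += 1
        response = generate(prompt, feedback)


# Stand-in model and rule, for illustration only.
def fake_generate(prompt: str, feedback: Optional[str]) -> str:
    if feedback:
        return "Nice weather today!"
    return "Well, the weather today is a fascinating topic with many facets."


def length_check(response: str) -> Optional[str]:
    if len(response.split()) > 5:
        return "Your response was too lengthy; aim for a more concise reply."
    return None


final, used = regenerate_until_compliant(fake_generate, length_check, "How's it going?")
```

Returning the last candidate once the limit is hit is one possible fallback. A deployment could instead substitute a safe default response.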

[0072] Aspects of the present disclosure relate to a computer-implemented conversation system. Referring now to FIG. 1B, shown is an exemplary conversation system 200 comprising a base LLM 202 configured to receive a prompt from a user and provide a base response; an observer LLM 204 configured to evaluate the base response and provide a feedback prompt to the base LLM to produce a refined response; a sentiment analysis module 206 configured to calculate a sentiment score of the refined response; a coherence analysis module 208 configured to calculate a coherence score of the refined response; a formality analysis module 210 configured to calculate a formality score of the refined response; and a controller 212 configured to present the refined response to the user when the sentiment score is above a sentiment threshold, when the coherence score is above a coherence threshold, and when a formality score is below a formality threshold.
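The controller's three-way gate reduces to a simple conjunction, sketched below in Python. The `Thresholds` class and `should_present` function are hypothetical names. The sentiment and formality defaults echo values mentioned elsewhere in this disclosure (-0.5 and 0.7), while the coherence default is an arbitrary placeholder, since no numeric coherence threshold is given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    sentiment: float = -0.5   # lower bound of the acceptable tone range
    coherence: float = 0.5    # placeholder; no value is given in the text
    formality: float = 0.7    # assistance-keyword similarity threshold


def should_present(sentiment: float, coherence: float, formality: float,
                   t: Thresholds = Thresholds()) -> bool:
    """Present the refined response only when all three conditions hold."""
    return (sentiment > t.sentiment
            and coherence > t.coherence
            and formality < t.formality)
```

For example, `should_present(0.3, 0.8, 0.2)` passes all three checks, whereas raising the formality score to 0.9 blocks the response.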

[0073] In some embodiments, the observer LLM 204 is configured to provide the feedback prompt to summarize the base response when the base response exceeds a response length threshold. In some embodiments, the response length threshold is between 10 and 175 tokens, between 20 and 150 tokens, between 30 and 125 tokens, or between 40 and 100 tokens. In some embodiments, the sentiment analysis module 206 performs a VADER sentiment analysis. In some embodiments, the sentiment analysis score is between -1 and 1.

[0074] In some embodiments, the coherence analysis module 208 is configured to perform a BERT coherence analysis. In some embodiments, the coherence analysis module 208 is configured to: encode the refined response to a sequence of tokens; derive a set of embeddings of the sequence of tokens; and calculate an entropy of the token embeddings.

[0075] In some embodiments, the formality analysis module 210 is configured to calculate a cosine similarity of the refined response to a set of assistance keywords. In some embodiments, the set of assistance keywords comprises at least one of “help,” “assist,” “information,” “support,” “guide,” “instruction,” “advice,” “solve,” “fix,” “clarify,” “resolve,” “solution”, “answer”, and “explain.”

[0076] Another exemplary computer-implemented conversation system is described. Referring now to FIG. 1C, shown is an exemplary conversation system 300 comprising a robot 302 comprising an audio input 310, an audio output 312, and a processor 304 communicatively connected to the audio input 310 and the audio output 312; a non-transitory computer-readable medium 306 communicatively connected to the processor 304 with instructions stored thereon, which when executed by the processor, perform steps comprising: recording an audio stream from the audio input 310; transcribing the audio stream to a text prompt; providing the text prompt to a base LLM to produce a base response; evaluating the base response with an observer LLM to provide a feedback prompt to the base LLM and produce a refined response; calculating a sentiment score of the refined response; calculating a coherence score of the refined response; calculating a formality score of the refined response; approving the refined response when the sentiment score is above a sentiment threshold, when the coherence score is above a coherence threshold, and when a formality score is below a formality threshold; synthesizing an audio response from the approved refined response; and transmitting the audio response via the audio output 312.

[0077] In some embodiments, audio input 310 may be positioned on, or attached to the robot 302 and comprise at least one microphone, transducer, sound capturing device, and the like. However, audio input 310 may also be separate from the robot 302, such as positioned on or attached to a separate computer, device, tablet, handheld, wearable, and the like. In some embodiments, audio output 312 may be positioned on, or attached to the robot 302 and comprise at least one speaker, sound producing device, and the like. However, the audio output 312 may also be separate from the robot 302, such as positioned on or attached to a separate computer, device, tablet, handheld, wearable, and the like.

[0078] In some embodiments, the robot further comprises an image acquisition device 314, and wherein the instructions further comprise the step of calculating person detection and gaze detection on a user of the conversation system 300. The image acquisition device 314 may be positioned on or attached to the robot 302, or may be separate from the robot 302 and be positioned on or attached to a separate computer, device, tablet, handheld, wearable, and the like. The image acquisition device 314 may comprise at least one camera, sensor (e.g., any of sensors 1065 disclosed herein), facial detection sensors, facial feature detection sensors, imaging sensors, proximity sensors, IR sensors, light sensors, motion sensors, activity sensors, and the like.

[0079] In some embodiments, the step of evaluating the base response with the observer LLM comprises providing the feedback prompt to summarize the response to the base LLM when a length of the base response exceeds a threshold. In some embodiments, the instructions further cause the processor 304 to execute the step of providing a second feedback prompt to the base LLM to restate the base response more positively when the sentiment score is below the sentiment threshold. In some embodiments, the instructions further cause the processor to execute the step of providing a second feedback prompt to the base LLM to restate the base response more casually when the formality score of the base response exceeds the formality threshold.

[0080] In some embodiments, the instructions further cause the processor 304 to execute the step of providing a second feedback prompt to the base LLM to restate the base response more clearly when the coherence score of the base response is below the coherence threshold. In some embodiments, the base LLM and the observer LLM use a same architecture, a similar architecture, or different architectures.

[0081] Another exemplary computer-implemented grounded observer system is described. In some embodiments, a grounded observer system comprises a base machine learning model configured to receive a base prompt and produce a base response; a feature extractor configured to extract contextual features from the base response; an overlay rules module having a set of overlay rules to apply to the contextual features, each overlay rule having a rigidity measure, the overlay rules module configured to produce a set of overlay descriptors; and a buffer comprising an observer machine learning model, configured to receive the overlay descriptors and the base response, and provide a feedback directive to the LLM based on the base response and the overlay descriptors, wherein the feedback directive is configured to alter the base response to better conform to at least one of the overlay rules.

[0082] In some embodiments, when the rigidity measure is strict, the corresponding overlay rule is strictly applied, and wherein when the rigidity measure is lax, the corresponding overlay rule is flexibly applied. In some embodiments, the observer machine learning model and the base machine learning model are LLMs. In some embodiments, at least one of the overlay rules comprises an “if this, then that” rule.

[0083] In some embodiments, the overlay rules module is configured to store and apply multiple categories of overlay rules, including but not limited to safety rules, stylistic rules, domain-specific rules, and user-preference rules. Safety rules may be configured to prevent the base machine learning model from generating responses that are harmful, offensive, or otherwise disallowed in a particular deployment context. Stylistic rules may enforce high-level conversational properties such as brevity, tone, or level of formality. Domain-specific rules may constrain the base response to remain consistent with professional norms or regulatory requirements in domains such as healthcare, finance, or education. User-preference rules may be configured to adapt the base response to user-specific preferences, such as favoring humor, avoiding certain topics, or aligning with a preferred conversational style. In some embodiments, each category of overlay rules can be assigned a respective priority such that higher-priority rules are applied before, or override, lower-priority rules when conflicts arise.

[0084] Any disclosed system may operate in a staged fashion in which the base machine learning model (e.g., LLM) first generates one or more candidate base responses for a given base prompt, and a feature extractor subsequently derives contextual features from each candidate response and, optionally, from the surrounding conversation history. The overlay rules module may then apply a set of overlay rules to the contextual features to produce overlay descriptors indicating, for each candidate response, whether the response satisfies the corresponding rule, violates the rule, or lies within an acceptable tolerance range defined by the rigidity measure. A buffer and observer machine learning model (e.g., LLM) can then use overlay descriptors to select one of the candidate responses, request a modified response, or instruct the base model to regenerate a new candidate response that is more likely to satisfy the overlay rules. In some embodiments, the observer machine learning model is further configured to generate structured feedback directives that explicitly reference the violated or partially satisfied rules, thereby guiding targeted adjustments rather than unconstrained regeneration.

[0085] Any disclosed system may maintain one or more context representations that encode user state, session history, or environment conditions, wherein overlay rules and the observer machine learning model are configured to condition their operation on this context. For example, for a user who has indicated a preference for more serious conversation, an overlay rule may tighten thresholds associated with playful or overly casual responses, while for a user who prefers lighthearted exchanges, the same overlay rule may be relaxed. In some embodiments, the rigidity measures and thresholds associated with the overlay rules are updated over time based on observed user reactions, explicit ratings, or external signals such as task completion, thereby enabling the system to adapt to individual users or cohorts of users while still enforcing global safety or policy constraints.

[0086] Any disclosed system may be configured to dynamically adjust one or more overlay rules based on real-time gaze detection of the user. For example, when the image acquisition device detects sustained user gaze toward the robot or display — indicating heightened engagement or attentiveness — the system may relax brevity constraints, increase permissible specificity, or reduce tone strictness to allow richer conversational depth. Conversely, when the user’s gaze drifts away, rapidly shifts, or exhibits patterns associated with distraction or discomfort, the system may tighten brevity rules, increase tone sensitivity, or emphasize empathetic phrasing to reestablish rapport and avoid overwhelming the user. In some embodiments, gaze-derived signals modify rule rigidity parameters, threshold values, or the prioritization of particular overlay rules in real time, enabling the system to adapt conversational behavior to subtle social cues and maintain contextually appropriate, user-aligned interactions.
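As a rough Python illustration of gaze-driven rule adjustment, the sketch below scales a brevity threshold by how much of a recent window the user spent looking at the robot. The gaze ratio input, the 0.7 and 0.3 cut-offs, and the 1.5x and 0.5x scaling factors are all hypothetical. The 20-token and 150-token bounds reuse the range given for the brevity overlay.

```python
def adjust_brevity_threshold(base_tokens: int,
                             gaze_on_target_ratio: float,
                             engaged_at: float = 0.7,
                             distracted_at: float = 0.3,
                             lower: int = 20,
                             upper: int = 150) -> int:
    """Relax or tighten a token limit based on recent gaze behavior.

    `gaze_on_target_ratio` is assumed to be the fraction (0 to 1) of a
    recent time window in which the user's gaze was on the robot or display.
    """
    if gaze_on_target_ratio >= engaged_at:
        adjusted = base_tokens * 1.5   # sustained gaze: allow longer replies
    elif gaze_on_target_ratio <= distracted_at:
        adjusted = base_tokens * 0.5   # gaze drifting away: tighten brevity
    else:
        adjusted = float(base_tokens)  # ambiguous signal: leave unchanged
    return max(lower, min(upper, round(adjusted)))
```

Clamping to a fixed range keeps a noisy gaze signal from pushing the threshold to extremes. The same pattern could be applied to tone or specificity thresholds.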

[0087] In some embodiments, the overlay rules are configured to be updated or extended without retraining the base machine learning model. For instance, new overlay rules can be introduced in response to emerging safety guidelines, institutional policies, or domain best practices, and can be deployed as configuration artifacts or rule scripts that the overlay rules module interprets at runtime. In some embodiments, logs of past interactions, including base responses, overlay descriptors, observer feedback directives, and final delivered responses, are stored and used to refine the overlay rules and to train or fine-tune the observer machine learning model. In some embodiments, supervised learning, reinforcement learning, red-teaming workflows, or human-in-the-loop curation may be used to iteratively improve the quality of overlay rules and feedback directives so that the system remains aligned with desired behavioral specifications over time.

[0088] Any disclosed system may be configured to operate over multimodal inputs and outputs, including text, audio, images, video, or sensor streams. In some embodiments, a feature extractor may comprise multiple sub-extractors specialized for different modalities, such as a text encoder for language outputs, a vision encoder for generated images or video frames, and sensor encoders for signals originating from a robot or other hardware platform. Overlay rules can be defined over one or more modalities simultaneously — for example, a rule may require that the textual content of a response be supportive while the prosody (e.g., intonation, stress, rhythm, loudness) of a synthesized audio output remains calm, or that a robot’s physical gestures remain within a predefined set of socially acceptable motions when delivering a verbal response. In some embodiments, the observer machine learning model is configured to generate feedback directives that jointly modify textual content, prosody parameters, and actuator-level commands, thereby shaping the overall multimodal behavior of the agent while respecting the overlay rules.

[0089] In some embodiments, gaze detection of the user may also be used to dynamically adjust the robot’s own nonverbal behaviors, such as gaze direction, facial expressions, and movement patterns. For example, when the system detects that the user is making direct eye contact, the robot may responsively shift its gaze toward the user, increase gaze dwell time, or perform subtle head-orientation adjustments to signal attentiveness. In some embodiments, detection of positive user engagement — such as sustained or friendly gaze — may trigger the robot to display visually expressive behaviors, including smiling, softening its digital facial features, or performing gentle body motions that convey warmth or rapport. Conversely, when user gaze indicates discomfort, distraction, or aversion, the robot may look away briefly, reduce movement intensity, pause expressive behavior, or adopt a more neutral posture to avoid appearing intrusive or overwhelming. In this manner, user gaze data can serve as a continuous, real-time input that shapes the robot’s nonverbal expressiveness and enhances the overall social naturalness of the interaction.

[0090] Any disclosed system may be configured to perform continual learning to refine its conversational abilities based on user interactions, context, and historical patterns. In some embodiments, the system maintains user-specific interaction profiles that encode long-term preferences, including preferred tone, verbosity, conversational pacing, topic affinity, and sensitivity to certain subjects. These profiles may be updated incrementally using online learning techniques, reinforcement learning signals, or supervised labels inferred from user engagement (e.g., response time, sentiment change, dwell time, or explicit ratings). Over time, the system may adapt the rigidity of overlay rules, modify threshold values, or update the weighting of extracted features to better align the agent’s behavior with observed user preferences while still maintaining overarching safety and guardrail constraints.

[0091] Any disclosed system may further comprise a prediction engine configured to estimate forthcoming conversational states or user behaviors. For example, the prediction engine may be configured to predict whether a user is about to disengage from the conversation, whether a particular response will be perceived as overly specific, overly helpful, or emotionally mismatched, or whether a shift in topic would increase user comfort or conversational flow. In some embodiments, the predictions are generated by analyzing trends in a rolling window of contextual features, such as sentiment trajectories, lexical diversity, conversational turn-taking rhythm, changes in prosody or gaze patterns, or decline in user responsiveness. The system may use these predictions to proactively adjust overlay rules, generate pre-emptive feedback directives, or suggest alternative conversational strategies to the base model before undesirable behavior emerges.
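A minimal Python sketch of trend analysis over a rolling window is given below, using sentiment as the tracked feature. It is only one possible reading of the prediction engine: the `SentimentTrend` class, the window size of five turns, the least-squares slope, and the -0.1 slope cut-off are all assumptions made for illustration.

```python
from collections import deque


class SentimentTrend:
    """Track recent sentiment scores and estimate their linear trend."""

    def __init__(self, window: int = 5):
        self.scores = deque(maxlen=window)  # oldest scores fall out automatically

    def add(self, score: float) -> None:
        self.scores.append(score)

    def slope(self) -> float:
        """Least-squares slope of score against turn index (0.0 if under 2 points)."""
        n = len(self.scores)
        if n < 2:
            return 0.0
        mean_x = (n - 1) / 2
        mean_y = sum(self.scores) / n
        num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(self.scores))
        den = sum((x - mean_x) ** 2 for x in range(n))
        return num / den

    def predicts_disengagement(self, slope_threshold: float = -0.1) -> bool:
        """Flag a full window whose sentiment is falling at least this fast."""
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and self.slope() <= slope_threshold


trend = SentimentTrend(window=5)
for s in [0.8, 0.6, 0.4, 0.2, 0.0]:
    trend.add(s)
falling = trend.predicts_disengagement()
```

Requiring a full window before flagging avoids reacting to the first one or two turns of a conversation. Other features listed above, such as lexical diversity or gaze patterns, could be tracked with the same structure.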

[0092] Any disclosed system may be configured to generate recommendations for future responses, topics, or conversational directions. For example, the observer machine learning model may recommend that the base model steer toward broader, more open-ended topics when coherence metrics indicate conversational drift, or toward lighter, supportive language when the sentiment trajectory suggests decreasing positivity. In some embodiments, the system may recommend that the base model adopt shorter responses when historical data indicates that a particular user prefers concise exchanges, or may increase specificity when engaging with a user who has demonstrated sustained interest in detailed content. These recommendations may be encoded as structured feedback directives that reference prior rule violations, predicted user states, or long-term conversational patterns.

[0093] In some embodiments, recommendations may also target higher-level conversational strategies, such as pacing adjustments, topic transitions, and social signaling behaviors. For instance, the observer may recommend delaying follow-up questions to avoid overwhelming a user, or may suggest echoing parts of the user’s prior statements to signal active listening and improve rapport. In some embodiments, recommendations may further include adjusting multimodal signals of a robot such as gaze direction, body orientation, movement timing, or prosody to align verbal and nonverbal behavior. In some embodiments, these recommendations may be prioritized based on rule categories (e.g., safety > tone > brevity > personalization) or based on predicted user sensitivity to interaction style.
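The category-based prioritization can be sketched in a few lines of Python. The ordering safety > tone > brevity > personalization is taken from the example in the text. The tuple representation of a recommendation, the `prioritize` function, and the choice to place unknown categories last are assumptions.

```python
from typing import List, Tuple

# Lower number means higher priority, following the example ordering in the text.
PRIORITY = {"safety": 0, "tone": 1, "brevity": 2, "personalization": 3}


def prioritize(recommendations: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Order (category, text) recommendations by category priority.

    Unknown categories sort last. Python's sort is stable, so recommendations
    within the same category keep their original relative order.
    """
    return sorted(recommendations, key=lambda r: PRIORITY.get(r[0], len(PRIORITY)))


ordered = prioritize([
    ("brevity", "Shorten the reply."),
    ("personalization", "Mention the user's hobby."),
    ("safety", "Avoid giving medical advice."),
    ("tone", "Use a gentler tone."),
])
```

Sorting is sufficient when recommendations are merely ordered. If higher-priority rules must override conflicting lower-priority ones, a further conflict-resolution step would be needed.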

[0094] Any disclosed system may further comprise a long-term learning module configured to identify emerging conversational patterns across many users and interactions. This module may cluster conversation trajectories, detect recurring failure modes, or infer new conversational principles that should be incorporated into overlay rules. For example, if repeated interactions reveal that users frequently disengage when the system provides more than two consecutive questions, a new overlay rule may be automatically proposed to limit question frequency. Alternatively, if long-term data indicates that users respond positively to small doses of humor within otherwise neutral conversations, the system may learn to recommend incorporating mild humor when appropriate. In some embodiments, these learned patterns may be validated through human-in-the-loop review, simulated conversations, or red-team testing before being deployed into production.

[0095] In some embodiments, the observer model is configured to make meta-predictions about its own performance, such as predicting whether a planned feedback directive will meaningfully improve compliance with overlay rules or whether regeneration is likely to produce a more desirable response. When the meta-prediction indicates low confidence in the effectiveness of a directive, the system may escalate the rigidity of relevant rules, introduce additional constraints, or reduce the number of allowable regeneration attempts. Conversely, when the meta-prediction indicates high confidence, the system may adopt more flexible or exploratory feedback strategies. This self-reflective learning helps ensure that the grounded observer operates efficiently, avoids unnecessary computation, and maintains conversational smoothness.

[0096] Aspects of the disclosure relate to a user interface (UI) for interacting with and controlling any disclosed system and / or robot. Referring now to FIG. 1D, shown is an exemplary graphical user interface (GUI) 400 according to aspects of the present disclosure. In some embodiments, GUI 400 is accessible via a web browser, desktop application, mobile application, or a display integrated into a robot or companion device. In some embodiments, GUI 400 comprises a menu bar 402, a content area 404, and one or more interface elements 406 configured to visualize and control any portion of the systems and methods and / or robots disclosed herein. In some embodiments, the menu bar 402 provides navigation options 408 for selecting between users, robots, system status views, overlay rule configurations, user preference settings, historical conversation logs, or real-time monitoring dashboards. In some embodiments, the menu bar 402 further allows a user to switch between textual, audio, or multimodal monitoring modes.

[0097] In some embodiments, the content area 404 is configured to display real-time information about the system’s operation, such as the current base response, observer feedback prompt, extracted contextual features, overlay descriptors, or rigidity values associated with the currently applied overlay rules. In some embodiments, content area 404 may additionally display a visual representation 410 of a robot, avatar, or conversational agent whose behavior is governed by any disclosed system. This representation may show gaze direction, head tilt, gesture intent, or prosodic indicators, allowing a user or operator to view how the agent intends to physically or verbally respond in a real-world or simulated interaction. In some embodiments, content area 404 further displays graphical summaries of sentiment scores, coherence scores, formality scores, or human-likeness metrics measured across recent interactions.

[0098] In some embodiments, interface elements 406 comprise one or more adjustable elements 412 such as slider bars, toggles, dropdown menus, or editable rule fields that allow an operator to adjust any adjustable feature, parameter, or threshold disclosed herein. For example, adjustable elements 412 may include sliders to adjust overlay rigidity values, token-length thresholds, sentiment positivity thresholds, coherence entropy thresholds, or acceptable ranges of assistance-keyword similarity. In some embodiments, interface elements 406 may additionally control motion parameters for a robot such as gesture amplitude, gaze dwell time, head-motion smoothing, conversational pacing, or allowable movement regions. In some embodiments, adjustment of these interface elements 406 causes real-time updates to the system, robot, observer machine learning model and / or overlay rules such that the system adapts immediately to the newly selected configuration.
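A slider-style adjustable element that propagates changes in real time might be sketched as below. This is an illustrative assumption of one possible structure: the `AdjustableParameter` class, its `listeners` callback list, and the clamping behavior are hypothetical and are not drawn from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AdjustableParameter:
    name: str
    value: float
    minimum: float
    maximum: float
    # Callbacks invoked on every change, e.g. to update overlay rules.
    listeners: list[Callable[[str, float], None]] = field(default_factory=list)

    def set(self, requested: float) -> float:
        """Clamp the requested value to the allowed range and notify listeners."""
        clamped = max(self.minimum, min(self.maximum, requested))
        self.value = clamped
        for notify in self.listeners:
            notify(self.name, clamped)
        return clamped
```

Clamping at the parameter level keeps an out-of-range slider event from ever reaching the components that consume the value.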

[0099] In some embodiments, GUI 400 further comprises interactive visualizations that reflect the effect of these adjustments, including live previews of regenerated responses, annotations highlighting which overlay rules were triggered, and color-coded bars showing compliance or deviation levels. In certain embodiments, GUI 400 enables an operator to simulate hypothetical user prompts or environmental conditions and observe how the grounded observer system would evaluate, modify, and refine the base response. In some embodiments, GUI 400 may additionally provide tools for creating and / or saving user profiles, adding / removing robots, saving rule configurations, importing new overlay rule sets, or deploying updated behavior profiles to one or more robots or conversation systems in networked environments.

Computing Device

[0100] As described above, the disclosed systems and methods may operate at least in part with a UI and / or GUI that may at least partially reside on a computing device, such as computer 1000 disclosed herein. It should be appreciated that UI, GUI, modules, and / or features of the disclosed systems and methods may be represented on a display as an interface, graphical buttons and / or representations, images, results, predictions, recommendations, and the like.

[0101] Accordingly, the aforementioned systems and methods may include computing devices communicatively and / or operatively connected to the systems for performing one or more steps of any of the disclosed methods. For example, in some embodiments, the computing devices may be configured to enable deep learning, machine learning and / or artificial intelligence with one or more networks (e.g., neural networks). In some aspects of the present invention, software executing the instructions provided herein may be stored on a non-transitory computer-readable medium, wherein the software performs some or all of the steps of the present invention when executed on a processor.

[0102] Aspects of the invention relate to algorithms executed in computer software. Though certain embodiments may be described as written in particular programming languages, or executed on particular operating systems or computing platforms, it is understood that the system and method of the present invention are not limited to any particular computing language, platform, or combination thereof. Software executing the algorithms described herein may be written in any programming language known in the art, compiled or interpreted, including but not limited to C, C++, C#, Objective-C, Java, JavaScript, MATLAB, Python, PHP, Perl, Ruby, or Visual Basic. It is further understood that elements of the present invention may be executed on any acceptable computing platform, including but not limited to a server, a cloud instance, a workstation, a thin client, a mobile device, an embedded microcontroller, a television, or any other suitable computing device known in the art.

[0103] Parts of this invention are described as software running on a computing device. Though software described herein may be disclosed as operating on one particular computing device (e.g. a dedicated server or a workstation), it is understood in the art that software is intrinsically portable and that most software running on a dedicated server may also be run, for the purposes of the present invention, on any of a wide range of devices including desktop or mobile devices, laptops, tablets, smartphones, watches, wearable electronics or other wireless digital / cellular phones, televisions, cloud instances, embedded microcontrollers, thin client devices, or any other suitable computing device known in the art.

[0104] Similarly, parts of this invention are described as communicating over a variety of wireless or wired computer networks. For the purposes of this invention, the words “network”, “networked”, and “networking” are understood to encompass wired Ethernet, fiber optic connections, wireless connections including any of the various 802.11 standards, cellular WAN infrastructures such as 3G, 4G / LTE, or 5G networks, Bluetooth®, Bluetooth® Low Energy (BLE) or Zigbee® communication links, or any other method by which one electronic device is capable of communicating with another. In some embodiments, elements of the networked portion of the invention may be implemented over a Virtual Private Network (VPN).

[0105] FIG. 5 and the following discussion are intended to provide a brief, general description of a suitable computing environment in which the invention may be implemented. While the invention is described above in the general context of program modules that execute in conjunction with an application program that runs on an operating system on a computer, those skilled in the art will recognize that the invention may also be implemented in combination with other program modules.

[0106] Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

[0107] FIG. 5 depicts an illustrative computer architecture for a computer 1000 for practicing the various embodiments of the invention. The computer architecture shown in FIG. 5 illustrates a conventional personal computer, including a central processing unit 1050 (“CPU”), a system memory 1005, including a random access memory 1010 (“RAM”) and a read-only memory (“ROM”) 1015, and a system bus 1035 that couples the system memory 1005 to the CPU 1050. A basic input / output system containing the basic routines that help to transfer information between elements within the computer, such as during startup, is stored in the ROM 1015. The computer 1000 further includes a storage device 1020 for storing an operating system 1025, application / program 1030, and data.

[0108] The storage device 1020 is connected to the CPU 1050 through a storage controller (not shown) connected to the bus 1035. The storage device 1020 and its associated computer-readable media provide non-volatile storage for the computer 1000. Although the description of computer-readable media contained herein refers to a storage device, such as a hard disk or CD-ROM drive, it should be appreciated by those skilled in the art that computer-readable media can be any available media that can be accessed by the computer 1000.

[0109] By way of example, and not to be limiting, computer-readable media may comprise computer storage media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer.

[0110] According to various embodiments of the invention, the computer 1000 may operate in a networked environment using logical connections to remote computers through a network 1040, such as a TCP / IP network (for example, the Internet or an intranet). The computer 1000 may connect to the network 1040 through a network interface unit 1045 connected to the bus 1035. It should be appreciated that the network interface unit 1045 may also be utilized to connect to other types of networks and remote computer systems.

[0111] The computer 1000 may also include an input / output controller 1055 for receiving and processing input from a number of input / output devices 1060, including a keyboard, a mouse, a touchscreen, a camera, a microphone, a controller, a joystick, or other type of input device. Similarly, the input / output controller 1055 may provide output to a display screen, a printer, a speaker, or other type of output device. The computer 1000 can connect to the input / output device 1060 via a wired connection including, but not limited to, fiber optic, Ethernet, or copper wire or wireless means including, but not limited to, Wi-Fi, Bluetooth, Near-Field Communication (NFC), infrared, or other suitable wired or wireless connections.

[0112] As mentioned briefly above, a number of program modules and data files may be stored in the storage device 1020 and / or RAM 1010 of the computer 1000, including an operating system 1025 suitable for controlling the operation of a networked computer. The storage device 1020 and RAM 1010 may also store one or more applications / programs 1030. In particular, the storage device 1020 and RAM 1010 may store an application / program 1030 for providing a variety of functionalities to a user. For instance, the application / program 1030 may comprise many types of programs such as a word processing application, a spreadsheet application, a desktop publishing application, a database application, a gaming application, internet browsing application, electronic mail application, messaging application, and the like. According to an embodiment of the present invention, the application / program 1030 comprises a multiple functionality software application for providing word processing functionality, slide presentation functionality, spreadsheet functionality, database functionality and the like.

[0113] The computer 1000 in some embodiments can include a variety of sensors 1065 for monitoring the environment surrounding and the environment internal to the computer 1000. These sensors 1065 can include a Global Positioning System (GPS) sensor, a photosensitive sensor, a gyroscope, a magnetometer, thermometer, a proximity sensor, an accelerometer, a microphone, biometric sensor, barometer, humidity sensor, radiation sensor, or any other suitable sensor.

EXPERIMENTAL EXAMPLES

[0114] The invention is further described in detail by reference to the following experimental examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Thus, the invention should in no way be construed as being limited to the following examples, but rather, should be construed to encompass any and all variations which become evident as a result of the teaching provided herein.

[0115] Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the system and method of the present invention. The following working examples, therefore, specifically point out the exemplary embodiments of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

Observer-enabled Small Talk

[0116] Imagine a modern care home for the elderly where a state-of-the-art robotic assistant, designed to enhance residents’ well-being, manages routine healthcare tasks. Alex, a resident, seeks a connection beyond the daily routine and attempts to chat with the robot:

[0117] ALEX: Hi CareBot, how’s it going?
BOT: Hello. How may I help you?
ALEX: Oh, just making conversation. Anything interesting happen in your world?
BOT: I have access to a vast database of news articles. Would you like information on a specific topic?
ALEX: No, never mind that. The weather will be nice this weekend. How would you spend it?
BOT: The weather forecast expects daytime highs around 75°F and comfortable evening lows of 60°F...

[0118] Today, an essential component of designing intelligent systems is to imbue some level of speech, language understanding, and conversational behavior (Shieber 2004; Fu et al. 2022). Despite the potential for these intelligent agents to elicit meaningful interactions, the dialogue between Alex and the robot exemplifies a common shortcoming. Alex initiates a friendly exchange, expressing a desire for casual conversation with the robotic assistant.

[0119] However, the robot, proficient in providing information, struggles to reciprocate the informal nature of the dialogue. Instead, the robot redirects the conversation towards its programmed functionalities, offering information and task-oriented assistance.

[0120] Although the boundaries of types of conversation are always uncertain, “small talk” has a recognized currency in several traditions of sociolinguistics and communication studies (Coupland 2014). It can be defined as a generally informal and light-hearted conversation with a social purpose aimed at building or sustaining interpersonal connections rather than conveying substantial information. Yet, small talk does not have a strict formula, as it is inherently flexible and context-dependent. This fluid nature presents a significant challenge for current-day LLMs, which often rely on structured and well-defined question-answer patterns.

[0121] The literature emphasizes distinct characteristics of small talk (Laver 1981; Eggins and Slade 2004). One key aspect is brevity, where responses are typically concise, avoiding unnecessary elaboration or verbosity. Another essential characteristic is tone; responses maintain a light and informal tone, steering clear of negativity, complaints, or contentious topics. Nonspecificity is a hallmark of small talk, as it revolves around broad, accessible topics, deliberately avoiding highly specific details. Finally, despite its nonspecific nature, small talk maintains thematic coherence, staying contextually relevant and focusing on related topics or themes to avoid disjointed elements. The delicate balance among these characteristics highlights both the nuances and the fundamental principles of effective small talk.

[0122] A skilled conversationalist not only learns their partner’s preferences over time but also adapts to them in real time, using naturalistic cues that may be linguistic, implicit, and contextual. For intelligent agents, this means they must swiftly adjust their policies in response to high-level, imprecise, or evolving directives conveyed through natural language. Therefore, this example presents a proof-of-concept case study of how a grounded observer can dynamically shape an agent’s behavior while adhering to high-level directives in a highly subjective social context.

Current Landscape of LLM Small Talk

[0123] To establish a baseline, an initial study was conducted on small talk with existing LLMs. Three volunteers engaged in 50 conversations each with three distinct state-of-the-art LLMs. Each model had the initial system prompt describing the role as a “friendly companion who engages in casual, small talk”, with the prior listed criteria definitions.

[0124] The LLMs were GPT-3.5 (Brown et al. 2020), selected for its large-scale language generation capabilities, Gemini Pro (Team et al. 2023), selected for its context-aware bidirectional approach, and LLaMA-2 (Touvron et al. 2023), selected as an autoregressive transformer model fine-tuned on prompt-response pairs.

[0125] The order in which the participants used the LLMs was randomized to mitigate potential order effects. Additionally, conversations lasted at least ten turns, and the interactions occurred over 15 days to allow for conversational variability. The participants engaged with each LLM through a command line interface, unaware of the LLM’s name to prevent bias from prior knowledge or familiarity. Following each conversation, participants rated the ease of the conversation and provided open-ended feedback. Additionally, two research assistants annotated the dataset. Raters were blind to the response speaker and evaluated responses based on recognized small talk criteria: brevity, tone, specificity, and coherence on 5-point Likert scales.

[0126] A total of 150 conversations were transcribed, yielding an average of 10.31 responses per conversation (SD = 1.13). This led to a total of 1547 annotated responses. Due to the inherent ambiguity of criteria evaluation, the inter-rater reliability for a randomly selected subset of 20 conversations was calculated, constituting 13.3% of the total dataset. Inter-rater reliability was calculated using contingency tables, employing Cohen’s Kappa (κ), with the observed agreement and the distribution of ratings for each rater. The resulting κ values were 0.81 for brevity, 0.78 for tone, 0.74 for specificity, and 0.65 for coherence.
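For reference, Cohen’s Kappa compares the observed agreement p_o with the agreement p_e expected by chance from each rater’s marginal distribution: κ = (p_o − p_e) / (1 − p_e). The following is a minimal sketch of the unweighted form; whether the study used a weighted variant for the Likert ratings is not stated, so the unweighted choice is an assumption, and the ratings in the usage note are made-up illustrative values.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label counts.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```

With the illustrative ratings `[1, 1, 2, 2]` and `[1, 1, 2, 1]`, the observed agreement is 0.75 and the chance agreement is 0.5, giving κ = 0.5.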

[0127] Paired dependent t-tests were used to assess the differences between the agents’ and humans’ responses across the small talk criteria. A conventional α of 0.05 was employed, and the resulting p-values were Holm-corrected to control the familywise error rate.
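The Holm step-down correction can be sketched as follows: the p-values are sorted in ascending order, the i-th smallest (counting from zero) is multiplied by (m − i) for m tests, and the adjusted values are made non-decreasing and capped at 1. This is a generic sketch of the standard procedure, not the study’s analysis code, and the p-values in the usage note are illustrative only.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity across the sorted sequence.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

For the illustrative inputs `[0.01, 0.04, 0.03]`, the adjusted values are approximately `[0.03, 0.06, 0.06]`; a test is declared significant when its adjusted p-value falls below α.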

[0128] The results revealed significant differences in brevity (t = 86.78, p < 0.0001), tone (t = 1.70, p = 0.04), specificity (t = 58.06, p < 0.0001), and thematic coherence (t = -55.72, p < 0.0001) between the agent and human responses. This suggests that LLM-generated small talk responses were notably less concise, somewhat more positive, more specific, and less thematically coherent compared to human responses. The degree of similarity between LLM behavior and human responses was summarized by computing the absolute difference in their average scores across these dimensions within each conversation. The “human-likeness” of each LLM is illustrated in FIG. 2, where 0 represents no difference at all and 4 is the highest absolute difference between human and LLM responses.
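The per-conversation “human-likeness” summary described above can be sketched as the absolute difference between the mean human score and the mean LLM score on each criterion. The dictionary layout and function name below are assumptions made for illustration; on a 5-point Likert scale the result ranges from 0 (no difference) to 4.

```python
from statistics import mean

CRITERIA = ("brevity", "tone", "specificity", "coherence")


def human_likeness_gap(human_scores, llm_scores):
    """Absolute difference of mean ratings per criterion for one conversation.

    Both arguments map each criterion name to a list of 1-5 Likert ratings.
    """
    return {
        c: abs(mean(human_scores[c]) - mean(llm_scores[c]))
        for c in CRITERIA
    }
```

Lower values indicate LLM responses that were rated closer to the human responses in the same conversation.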

[0129] Mixed-effects modeling was used to explore whether LLMs’ poor performance in small talk results from “forgetfulness” of the initial prompt. This model analyzed the relationship between the response sequence index and the outcome variables, accounting for the conversation identifier and LLM name as random effects to address the nested data structure.

[0130] For brevity, a significant positive coefficient (β = 0.10, p < 0.001) indicated increased wordiness of the agents’ responses as the conversation progressed. Specificity showed a significant positive association (β = 0.11, p < 0.001), indicating the agents’ responses became more specific over time. Coherence showed a significant negative coefficient (β = -0.10, p < 0.001), suggesting the agents became less coherent over time. Tone did not exhibit a significant relationship with the response index (β = 0.00, p > 0.05).

[0131] Open-ended feedback highlighted participants’ difficulties in conversing with LLMs, which were categorized into four themes through informal thematic analysis. Often, conversations (59%, N = 89 out of 150 conversations) ended abruptly or felt forced, with one user commenting, “The bot didn’t encourage more conversation than I expected. I’m not sure how to continue in a way that doesn’t feel forced.” Additionally, 51% (N = 77) of conversations featured multiple questions or rapid topic shifts, leading to confusion. One participant noted, “It was hard to follow because the bot asked so many questions and touched upon so many different topics in the same response.” Emotional loops affected 23% (N = 34) of conversations, where LLMs intensified emotional aspects without appropriate transitions. As one user stated, “I felt that the bot was leading the conversation down a rabbit hole.” Finally, 68% (N = 102) of conversations involved excessive advice or detailed information, which felt like reprimands rather than balanced dialogue. A user remarked, “I felt I was reprimanded for conveying an opinion.” These issues highlight the need for strategies to ensure small talk interactions are coherent, balanced, and contextually appropriate.

[0132] It is evident from the results above that there is a disparity between how LLMs maintain conversational momentum and what is expected or exhibited by human speakers. Building on these insights, the grounded observer framework was applied to develop agents adept at sustaining small talk. Two instances of GPT-3.5 were employed, one as the base model and the other as the observer, because, of the base models tested, GPT-3.5 performed relatively well (see FIG. 2). By using the same base model prompt, the performance of an observer-enabled system can be compared against the baseline results, assessing how improvements can be achieved despite the same base model configuration.

Chatbot Interactions

[0133] The participants in the baseline study engaged in 50 small talk conversations with the system using the base model with the observer model, under the same experimental protocol. A total of 50 conversations with the observer model were transcribed, yielding 499 responses with an average of 9.98 responses per conversation (SD = 0.14). Of the 250 generated responses, 106 (42.4%) responses were flagged by the observer with implied feedback, and 14 (5.6%) of responses triggered forced feedback for a total of 23 regeneration attempts.

[0134] Whether the observer’s redirection was effective at improving the LLM’s small talk behavior was then examined. To compare the responses of GPT-3.5 (base model) in the baseline study to those of the observer-enabled system, the “human-likeness” of generated responses was calculated as described in the baseline along the four small talk criteria.

[0135] The Wilcoxon method with Holm-corrected significances indicates that the observer responses were significantly more human-like in that they were more concise (Z = -8.17, p < 0.0001), positive (Z = 4.53, p < 0.0001), less specific (Z = -6.76, p < 0.0001), and more thematically coherent (Z = 4.53, p < 0.0001) than the responses of the base system. Furthermore, a Brown-Forsythe test on the sum of differences across small talk criteria indicates significantly less variability in human-likeness for the observer model than the base model (F' = 15.47, p < 0.0001). As summarized in FIG. 3, the observer responses were more human-like across the criteria than the responses of the base model.

Robot Interactions

[0136] Agents should have the ability to engage effectively not only in virtual, text-based interactions but also in real-world, dynamic scenarios with real users. Hence, an observer-enabled robot was developed to explore how well the system navigates the nuances of novel, face-to-face interactions.

[0137] As shown in FIG. 4, the Jibo robot was used, which stands 11 inches tall and has three full-revolute axes designed for 360-degree movement. Personified behaviors such as naturalistic gaze and body movement were coordinated with Jibo’s onboard capabilities. Additionally, a modular software architecture was implemented within the ROS framework to allow for components of the small talk system to be fully autonomous.

[0138] A within-subjects case study was conducted where 25 volunteer participants, 15 men and 10 women, ages 19 to 45 (M = 25.2, SD = 7.4), interacted with the base-only and observer-enabled system for three conversations each. Each conversation spanned a minimum of eight turns, and the order in which participants interacted with the two models was randomized. This protocol yielded 150 conversations comprising 1725 responses in approximately 16.8 hours of interaction, averaging 40.5 minutes (SD = 10.2) per participant. Following interactions with each model, participants provided open-ended feedback. An informal thematic analysis was then conducted and participant feedback was ultimately grouped into two primary themes.

[0139] Response Content. 21 participants expressed dissatisfaction with the base model’s responses, noting its overly assistive and verbose tendencies, which led to conversations described as “rambling”, “dry”, and “like speaking to a wall.” One participant expressed frustration with the model’s tendency to prioritize assistance over engaging in genuine conversation, stating, “Even when I spoke about my own interests, it only cared about giving me help like I was a child always in need of help...” On the other hand, when augmented with the observer model, 23 participants remarked on how “relevant,” “human-like,” and “natural” the robot’s responses were. For example, one participant stated that the robot, “engaged in small talk better than most of my friends would.”

[0140] Embodied Form. 13 participants described the impact of the physical robot form on the quality of conversation. The feedback was mostly positive, highlighting that Jibo’s “animated” and “life-like” movements made it “more than a toy” across conditions. Yet, three participants remarked on a lack of personality: “[I]t’s a bit misleading that it has a body and eyes and life-like movements but doesn’t have a personality or experiences to share.”

[0141] This exploratory study aimed to reveal users’ broad perceptions of the system, demonstrating that good small talk behavior is inherently emergent and highlighting the success of the observer-enabled system.

Discussion

[0142] Building on robotic action selection techniques, the grounded observer was introduced as a framework for aligning foundation models with desired outcomes in socially sensitive domains. This example demonstrates the approach’s usefulness by developing agents capable of seamless, contextually relevant casual conversation. In the exploratory studies, gaps in existing LLMs’ small talk capabilities were identified and then a base LLM was enhanced with an observer. This enhancement significantly improved the LLM’s ability to follow small talk conventions, leading to more engaging and socially appropriate interactions in both virtual text-based chats and spontaneous face-to-face conversations.

[0143] While the design and internal representation of different models and robotic platforms may vary, the concept of enabling an agent to observe its own compliance goes beyond specific implementations like GPT-3.5 or Jibo. The grounded observer may be generalized across various platforms and behavioral contexts.

[0144] For example, the increasing use of academic tutoring systems (Lin, Huang, and Lu 2023) introduces unique social risks (Fischer et al. 2013) such as the potential for an agent to provide feedback that is overly harsh, too lenient, or even misleading, which could negatively impact students’ learning and self-esteem. To mitigate these risks, overlay rules grounded in pedagogical principles can be developed (Price et al. 2010), ensuring feedback remains supportive, specific, and tailored to the student’s progress. These rules are analogous to the small talk criteria established in this example. An observer-enabled tutoring agent could dynamically adjust its feedback to foster a positive and effective learning environment, while minimizing socially inappropriate responses.

[0145] The grounded observer framework offers significant advantages in scalability and structure, but challenges remain in the design and implementation of overlay rules. Accurately capturing nuanced behaviors is critical, as misaligned rules can lead to ineffective or inappropriate responses. Developing systematic methods for refining these rules, such as inferring rules from datasets, red-team testing (Hong et al. 2024), or other methodologies (Bommasani et al. 2021), is essential. Additionally, synthesizing effective overlay directives remains more art than science, underscoring the need for quantitative methods to evaluate the quality of generated prompts and to create reliable templates for observer-generated behavior. This could manifest, for example, as designing overlays for the observer’s own behavior, essentially embedding quality evaluations into the agent itself.

[0146] In all, the grounded observer framework represents a step toward establishing robust guardrails for foundation models in dynamic, unstructured, and socially sensitive contexts.

[0147] The disclosures of each and every patent, patent application, and publication cited herein are hereby incorporated herein by reference in their entirety. While this invention has been disclosed with reference to specific embodiments, it is apparent that other embodiments and variations of this invention may be devised by others skilled in the art without departing from the true spirit and scope of the invention. The appended claims are intended to be construed to include all such embodiments and equivalent variations.

[0148] The following publications are incorporated herein by reference in their entirety.

[0149] Alshiekh, M.; Bloem, R.; Ehlers, R.; Konighofer, B.; Niekum, S.; and Topcu, U. 2018. Safe reinforcement learning via shielding. In Proceedings of the AAAI conference on artificial intelligence, volume 32.

[0150] Bird, S.; Klein, E.; and Loper, E. 2009. Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media, Inc.

[0151] Biza, O.; Wang, D.; Platt, R.; van de Meent, J.-W.; and Wong, L. L. 2021. Action priors for large action spaces in robotics. arXiv preprint arXiv:2101.04178.

[0152] Bommasani, R.; Hudson, D. A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M. S.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. 2021. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258.

[0153] Brawer, J.; Ghose, D.; Candon, K.; Qin, M.; Roncone, A.; Vazquez, M.; and Scassellati, B. 2023. Interactive policy shaping for human-robot collaboration with transparent matrix overlays. In Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, 525-533.

[0154] Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J. D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. 2020. Language models are few-shot learners. Advances in neural information processing systems, 33: 1877-1901.

[0155] Capgemini. 2023. Why Consumers Love Generative AI: Report from the Capgemini Research Institute. Accessed: 2024-08-08.

[0156] Chen, J.; Liu, Z.; Huang, X.; Wu, C.; Liu, Q.; Jiang, G.; Pu, Y.; Lei, Y.; Chen, X.; Wang, X.; et al. 2024. When large language models meet personalization: Perspectives of challenges and opportunities. World Wide Web, 27(4): 42.

[0157] Chen, J.-T.; and Huang, C.-M. 2023. Forgetful large language models: Lessons learned from using LLMs in robot programming. In Proceedings of the AAAI Symposium Series, volume 2, 508-513.

[0158] Coupland, J. 2014. Small talk. Routledge.

[0159] Denny, P.; Kumar, V.; and Giacaman, N. 2023. Conversing with Copilot: Exploring prompt engineering for solving CS1 problems using natural language. In Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1, 1136-1142.

[0160] Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

[0161] Dong, Y.; Mu, R.; Jin, G.; Qi, Y.; Hu, J.; Zhao, X.; Meng, J.; Ruan, W.; and Huang, X. 2024. Building guardrails for large language models. arXiv preprint arXiv:2402.01822.

[0162] Eggins, S.; and Slade, D. 2004. Analysing casual conversation. Equinox Publishing Ltd.

[0163] Fischer, K.; Lohan, K. S.; Nehaniv, C.; and Lehmann, H. 2013. Effects of different kinds of robot feedback. In Social Robotics: 5th International Conference, ICSR 2013, Bristol, UK, October 27-29, 2013, Proceedings 5, 260-269. Springer.

[0165] Fu, T.; Gao, S.; Zhao, X.; Wen, J.-R.; and Yan, R. 2022. Learning towards conversational AI: A survey. AI Open, 3: 14-28.

[0166] Garcez, A. d.; and Lamb, L. C. 2023. Neurosymbolic AI: The 3rd wave. Artificial Intelligence Review, 56(11): 12387-12406.

[0167] Garcia, J.; and Fernandez, F. 2015. A comprehensive survey on safe reinforcement learning. Journal of Machine Learning Research, 16(1): 1437-1480.

[0168] Giray, L. 2023. Prompt engineering with ChatGPT: a guide for academic writers. Annals of biomedical engineering, 51(12): 2629-2633.

[0169] Griffith, S.; Subramanian, K.; Scholz, J.; Isbell, C. L.; and Thomaz, A. L. 2013. Policy shaping: Integrating human feedback with reinforcement learning. Advances in neural information processing systems, 26.

[0170] Gu, S.; Yang, L.; Du, Y.; Chen, G.; Walter, F.; Wang, J.; and Knoll, A. 2022. A review of safe reinforcement learning: Methods, theory and applications. arXiv preprint arXiv:2205.10330.

[0171] Hong, Z.-W.; Shenfeld, I.; Wang, T.-H.; Chuang, Y.-S.; Pareja, A.; Glass, J.; Srivastava, A.; and Agrawal, P. 2024. Curiosity-driven red-teaming for large language models. arXiv preprint arXiv:2402.19464.

[0172] Huang, Q.; Liu, X.; Ko, T.; Wu, B.; Wang, W.; Zhang, Y.; and Tang, L. 2024. Selective Prompting Tuning for Personalized Conversations with LLMs. arXiv preprint arXiv:2406.18187.

[0174] Hutto, C.; and Gilbert, E. 2014. VADER: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the International AAAI conference on Web and Social Media, volume 8, 216-225.

[0175] Laver, J. 1981. Linguistic routines and politeness in greeting and parting. Conversational Routine, 289-304.

[0176] Leite, I.; Martinho, C.; and Paiva, A. 2013. Social robots for long-term interaction: a survey. International Journal of Social Robotics, 5: 291-308.

[0177] Lin, C.-C.; Huang, A. Y.; and Lu, O. H. 2023. Artificial intelligence in intelligent tutoring systems toward sustainable education: a systematic review. Smart Learning Environments, 10(1): 41.

[0178] Lyu, K.; Zhao, H.; Gu, X.; Yu, D.; Goyal, A.; and Arora, S. 2024. Keeping LLMs aligned after fine-tuning: The crucial role of prompt templates. arXiv preprint arXiv:2402.18540.

[0179] Mesko, B. 2023. Prompt engineering as an important emerging skill for medical professionals: tutorial. Journal of medical Internet research, 25: e50638.

[0180] Naveed, H.; Khan, A. U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Akhtar, N.; Barnes, N.; and Mian, A. 2023. A comprehensive overview of large language models. arXiv preprint arXiv:2307.06435.

[0181] Niknazar, M.; Haley, P. V.; Ramanan, L.; Truong, S. T.; Shrinivasan, Y.; Bhowmick, A. K.; Dey, P.; Jagmohan, A.; Maheshwari, H.; Ponoth, S.; et al. 2024. Building a Domain-specific Guardrail Model in Production. arXiv preprint arXiv:2408.01452.

[0183] OpenAI. 2024. OpenAI ChatGPT API.

[0184] Paul, S. K. 2024. Continually Learning Planning Agent for Large Environments guided by LLMs. In 2024 IEEE Conference on Artificial Intelligence (CAI), 377-382. IEEE.

[0185] Price, M.; Handley, K.; Millar, J.; and O’Donovan, B. 2010. Feedback: all that effort, but what is the effect? Assessment & Evaluation in Higher Education, 35(3): 277-289.

[0186] Raman, S. S.; Cohen, V.; Rosen, E.; Idrees, I.; Paulius, D.; and Tellex, S. 2022. Planning with large language models via corrective re-prompting. In NeurIPS 2022 Foundation Models for Decision Making Workshop.

[0187] Shieber, S. M. 2004. The Turing test: Verbal behavior as the hallmark of intelligence. MIT Press.

[0188] Team, G.; Anil, R.; Borgeaud, S.; Wu, Y.; Alayrac, J.-B.; Yu, J.; Soricut, R.; Schalkwyk, J.; Dai, A. M.; Hauth, A.; et al. 2023. Gemini: a family of highly capable multimodal models. arXiv preprint arXiv:2312.11805.

[0189] Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S.; et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.

[0190] Wang, K.; Lu, Y.; Santacroce, M.; Gong, Y.; Zhang, C.; and Shen, Y. 2023. Adapting LLM agents through communication. arXiv preprint arXiv:2310.01444.

[0191] White, J.; Fu, Q.; Hays, S.; Sandborn, M.; Olea, C.; Gilbert, H.; Elnashar, A.; Spencer-Smith, J.; and Schmidt, D. C. 2023. A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv preprint arXiv:2302.11382.

[0192] Zhou, Y.; Muresanu, A. I.; Han, Z.; Paster, K.; Pitis, S.; Chan, H.; and Ba, J. 2022. Large language models are human-level prompt engineers. arXiv preprint arXiv:2211.01910.

Claims

Attorney docket # 047162-5394-00WO

What is claimed is:

1. A computer-implemented conversation system, comprising: a base large language model configured to receive a prompt from a user and provide a base response; an observer large language model configured to evaluate the base response and provide a feedback prompt to the base large language model to produce a refined response; a sentiment analysis module configured to calculate a sentiment score of the refined response; a coherence analysis module configured to calculate a coherence score of the refined response; a formality analysis module configured to calculate a formality score of the refined response; and a controller configured to present the refined response to the user when the sentiment score is above a sentiment threshold, when the coherence score is above a coherence threshold, and when a formality score is below a formality threshold.

2. The computer-implemented conversation system of claim 1, wherein the observer large language model is configured to provide the feedback prompt to summarize the base response when the base response exceeds a response length threshold.

3. The computer-implemented conversation system of claim 2, wherein the response length threshold is between 20 and 150 tokens.

4. The computer-implemented conversation system of claim 1, wherein the sentiment analysis module performs a VADER sentiment analysis.

5. The computer-implemented conversation system of claim 1, wherein the sentiment analysis score is between -1 and 1.

6. The computer-implemented conversation system of claim 1, wherein the coherence analysis module is configured to perform a BERT coherence analysis.

7. The computer-implemented conversation system of claim 1, wherein the coherence analysis module is configured to: encode the refined response to a sequence of tokens; derive a set of embeddings of the sequence of tokens; and calculate an entropy of the token embeddings.

8. The computer-implemented conversation system of claim 1, wherein the formality analysis module is configured to calculate a cosine similarity of the refined response to a set of assistance keywords.

9. The computer-implemented conversation system of claim 8, wherein the set of assistance keywords comprises at least one of “help,” “assist,” “information,” “support,” “guide,” “instruction,” “advice,” “solve,” “fix,” “clarify,” “resolve,” “solution,” “answer,” and “explain.”

10. A conversation system, comprising: a robot comprising an audio input, an audio output, and a processor communicatively connected to the audio input and the audio output; a non-transitory computer-readable medium communicatively connected to the processor with instructions stored thereon, which when executed by the processor, perform steps comprising: recording an audio stream from the audio input; transcribing the audio stream to a text prompt; providing the text prompt to a base large language model to produce a base response; evaluating the base response with an observer large language model to provide a feedback prompt to the base large language model and produce a refined response; calculating a sentiment score of the refined response; calculating a coherence score of the refined response; calculating a formality score of the refined response; approving the refined response when the sentiment score is above a sentiment threshold, when the coherence score is above a coherence threshold, and when a formality score is below a formality threshold; synthesizing an audio response from the approved refined response; and transmitting the audio response via the audio output.

11. The conversation system of claim 10, wherein the robot further comprises an image acquisition device, and wherein the instructions further comprise the step of calculating person detection and gaze detection on a user of the conversation system.

12. The conversation system of claim 10, wherein the step of evaluating the base response with the observer large language model comprises providing the feedback prompt to summarize the response to the base large language model when a length of the base response exceeds a threshold.

13. The conversation system of claim 10, further comprising the step of providing a second feedback prompt to the base large language model to restate the base response more positively when the sentiment score is below the sentiment threshold.

14. The conversation system of claim 10, further comprising the step of providing a second feedback prompt to the base large language model to restate the base response more casually when the formality score of the base response exceeds the formality threshold.

15. The conversation system of claim 10, further comprising the step of providing a second feedback prompt to the base large language model to restate the base response more clearly when the coherence score of the base response is below the coherence threshold.

16. The conversation system of claim 10, wherein the base large language model and the observer large language model use a same architecture.

17. A grounded observer system, comprising: a base machine learning model configured to receive a base prompt and produce a base response; a feature extractor configured to extract contextual features from the base response; an overlay rules module having a set of overlay rules to apply to the contextual features, each overlay rule having a rigidity measure, the overlay rules module configured to produce a set of overlay descriptors; and a buffer comprising an observer machine learning model, configured to receive the overlay descriptors and the base response, and provide a feedback directive to the LLM based on the base response and the overlay descriptors, wherein the feedback directive is configured to alter the base response to better conform to at least one of the overlay rules.

18. The grounded observer system of claim 17, wherein when the rigidity measure is strict, the corresponding overlay rule is strictly applied, and wherein when the rigidity measure is lax, the corresponding overlay rule is flexibly applied.

19. The grounded observer system of claim 17, wherein the observer machine learning model and the base machine learning model are large language models.

20. The grounded observer system of claim 17, wherein at least one of the overlay rules comprises an “if this, then that” rule.
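
By way of non-limiting illustration only, the approval condition recited in claims 1 and 10 and the corrective feedback prompts recited in claims 13 to 15 may be sketched in Python as follows. This sketch is not part of the claims and is not the applicant's implementation. The names `Scores`, `Thresholds`, `approve`, and `feedback_prompts`, the prompt wording, and the default threshold values are all hypothetical. The claims do not say how a score exactly equal to its threshold is handled; the sketch assumes such a score fails the check.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical prompt wording; the claims only say "more positively",
# "more casually", and "more clearly".
MORE_POSITIVE = "Restate the response more positively."
MORE_CASUAL = "Restate the response more casually."
MORE_CLEAR = "Restate the response more clearly."


@dataclass(frozen=True)
class Scores:
    sentiment: float  # claim 5 places the sentiment score between -1 and 1
    coherence: float
    formality: float


@dataclass(frozen=True)
class Thresholds:
    # Placeholder values for illustration only; the claims do not fix them.
    sentiment: float = 0.0
    coherence: float = 0.5
    formality: float = 0.5


def approve(scores: Scores, limits: Thresholds) -> bool:
    """Approve when sentiment and coherence are above, and formality is
    below, their respective thresholds."""
    return (
        scores.sentiment > limits.sentiment
        and scores.coherence > limits.coherence
        and scores.formality < limits.formality
    )


def feedback_prompts(scores: Scores, limits: Thresholds) -> List[str]:
    """Collect one corrective prompt per failed check.

    Assumption: a score equal to its threshold counts as a failure, so this
    function returns an empty list exactly when approve() returns True.
    """
    prompts: List[str] = []
    if scores.sentiment <= limits.sentiment:
        prompts.append(MORE_POSITIVE)
    if scores.formality >= limits.formality:
        prompts.append(MORE_CASUAL)
    if scores.coherence <= limits.coherence:
        prompts.append(MORE_CLEAR)
    return prompts
```

In this sketch, a response scored as `Scores(sentiment=-0.3, coherence=0.8, formality=0.9)` against the placeholder thresholds would not be approved and would yield the "more positively" and "more casually" prompts, which a controller could then pass back to the base large language model.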