497 results about "Return function" patented technology

Computer gaming system

The present invention comprises an intelligent gaming system that includes a game engine, a simulation engine, and, in certain embodiments, a static evaluator. Embodiments of the invention include an intelligent, poker-playing slot machine that allows a user to play poker for money against one or more intelligent, simulated opponents. In one embodiment, the invention generates card-playing strategies by analyzing the expected return to the players of a game. In one embodiment, a multi-dimensional model is used to represent the possible strategies that may be used by each player participating in a card game. Each axis (dimension) of the model represents the distribution of a player's possible hands. Points along a player's distribution axis divide each axis into a number of segments, and each segment has associated with it an action sequence to be undertaken by the player with hands that fall within the segment. The dividing points thus delineate the boundaries between different action sequences. The model is divided into separate portions, each corresponding to an outcome determined by the action sequences and hand strengths applicable to that portion. An expected return expression is generated by multiplying the outcome for each portion by the size of the portion and adding together the resulting products. The locations of the dividing points that result in the maximum expected return are determined by taking partial derivatives of the expected return function with respect to each variable and setting them equal to zero. The result is a set of simultaneous equations that are solved to obtain values for each dividing point. The values of the optimized dividing points define optimized card-playing strategies.
Owner:GAMECRAFT
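
The procedure described in this abstract — building the expected return as a sum of outcome-times-portion-size terms, then setting each partial derivative to zero and solving the resulting simultaneous equations — can be illustrated with a small symbolic sketch. The two-threshold bet/call model on a unit square of hand strengths and all payoff values below are hypothetical stand-ins, not taken from the patent; the code only demonstrates the derivative-and-solve machinery using sympy.

```python
# pip install sympy
import sympy as sp

x, y = sp.symbols("x y", positive=True)

# Hypothetical per-portion payoffs to player 1 (illustrative, not from the patent):
a, b, W, L = -1, 1, 2, -4   # fold / bet and opponent folds / called bet won / called bet lost

# Portion sizes of the unit square of hand strengths (u, v), assuming player 1
# bets when u >= x, player 2 calls when v >= y, and y >= x at the solution.
fold      = x                                  # player 1 folds
steal     = (1 - x) * y                        # player 1 bets, player 2 folds
call_win  = (1 - y) ** 2 / 2                   # bet is called, player 1 wins
call_loss = (1 - x) * (1 - y) - call_win       # bet is called, player 1 loses

# Expected return = sum of (outcome * portion size), as described in the abstract.
E = a * fold + b * steal + W * call_win + L * call_loss

# Partial derivatives set to zero give simultaneous equations in the dividing points.
solution = sp.solve([sp.diff(E, x), sp.diff(E, y)], [x, y], dict=True)
print(solution)   # [{x: 13/25, y: 3/5}] for these hypothetical payoffs
```

Under these made-up payoffs the stationary point lands at x = 13/25, y = 3/5; in the patent's multi-player, multi-segment model the same procedure yields one equation per dividing point.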

Traffic signal self-adaptive control method based on deep reinforcement learning

Inactive · CN106910351A · Realize precise perception · Solve the problem of inaccurate perception of traffic status · Controlling traffic signals · Neural architectures · Traffic signal · Return function
The invention relates to the technical fields of traffic control and artificial intelligence and provides a traffic signal self-adaptive control method based on deep reinforcement learning. The method includes the following steps: 1) a traffic signal control agent, a state space S, an action space A and a return function r are defined; 2) a deep neural network is pre-trained; 3) the neural network is trained through a deep reinforcement learning method; and 4) traffic signal control is carried out according to the trained deep neural network. By preprocessing traffic data acquired from magnetic induction detectors, video, RFID, the Internet of Vehicles and the like, a low-level representation of the traffic state containing vehicle position information is obtained; the traffic state is then perceived through a multilayer perceptron of deep learning, and high-level abstract features of the current traffic state are obtained. On this basis, a suitable timing plan is selected according to the high-level abstract features of the current traffic state through the decision-making capacity of reinforcement learning, so that self-adaptive control of traffic signals is achieved, vehicle travel time is shortened, and safe, smooth, orderly and efficient operation of traffic is guaranteed.
Owner:DALIAN UNIV OF TECH
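
As a concrete illustration of step 1 of this method, the sketch below encodes one plausible choice of state representation (an occupancy grid built from detected vehicle positions) and return function (the reduction in cumulative waiting time between decision points). The cell count, lane length, action names and reward shape are assumptions for illustration; the abstract leaves these design choices open.

```python
import numpy as np

N_CELLS = 60                                             # assumed discretisation of the approach road
ACTIONS = ["NS_green", "EW_green", "NS_left", "EW_left"] # assumed action space A (timing plans)

def state_from_detectors(vehicle_positions, lane_length=300.0):
    """Low-level traffic state: an occupancy grid built from vehicle positions (in
    metres) reported by induction loops, video, RFID or connected-vehicle data."""
    grid = np.zeros(N_CELLS)
    idx = np.clip((np.asarray(vehicle_positions) / lane_length * N_CELLS).astype(int),
                  0, N_CELLS - 1)
    grid[idx] = 1.0
    return grid

def return_function(waiting_time_before, waiting_time_after):
    """Return function r: reward the agent for reducing cumulative waiting time
    between two decision points (one common choice; the patent does not fix r)."""
    return waiting_time_before - waiting_time_after

s = state_from_detectors([12.5, 40.0, 91.2, 250.0])
r = return_function(waiting_time_before=180.0, waiting_time_after=150.0)
print(int(s.sum()), r)   # 4 occupied cells, reward 30.0
```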

Wireless network resource allocation method based on deep reinforcement learning

The invention provides a wireless network resource allocation method based on deep reinforcement learning, by which the energy efficiency in a time-varying channel environment can be maximized with relatively low complexity. The method comprises the following steps: establishing a deep reinforcement learning model; modeling the time-varying channel environment between a base station and a user terminal as a finite-state time-varying Markov channel, determining a normalized channel coefficient and inputting it into a convolutional neural network qeval, selecting the action with the maximum output return value as the decision action, and allocating the sub-carriers to the users; allocating downlink power to the users reusing each subcarrier in inverse ratio to the channel coefficient according to the subcarrier allocation result, determining a return function based on the allocated downlink power, and feeding the return function back to the deep reinforcement learning model; and training the convolutional neural networks qeval and qtarget in the deep reinforcement learning model according to the determined return function, and determining the locally optimal power allocation under the time-varying channel environment. The method relates to the fields of wireless communication and artificial-intelligence decision-making.
Owner:UNIV OF SCI & TECH BEIJING
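
The power-allocation and reward-feedback step can be sketched as follows. The inverse-channel-ratio split and the energy-efficiency-style return function are illustrative assumptions consistent with the abstract's wording rather than the patent's exact formulas, and the noise and circuit-power constants are made up.

```python
import numpy as np

def allocate_power(channel_coeffs, p_total):
    """Split a subcarrier's downlink power budget among the users reusing it,
    in inverse ratio to their normalised channel coefficients."""
    h = np.asarray(channel_coeffs, dtype=float)
    weights = (1.0 / h) / np.sum(1.0 / h)
    return p_total * weights

def return_function(channel_coeffs, powers, noise=1e-3, circuit_power=0.1):
    """Return fed back to the DRL agent: sum rate per unit of consumed power,
    i.e. an energy-efficiency-style reward (constants are assumptions)."""
    h = np.asarray(channel_coeffs, dtype=float)
    rate = np.sum(np.log2(1.0 + h * np.asarray(powers) / noise))
    return rate / (np.sum(powers) + circuit_power)

h = [0.8, 0.2]                        # normalised channel coefficients of two users
p = allocate_power(h, p_total=1.0)    # weaker channel gets more power: [0.2, 0.8]
print(p, return_function(h, p))
```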

Intelligent control method for vertical recovery of carrier rockets based on deep reinforcement learning

An intelligent control method for vertical recovery of carrier rockets based on deep reinforcement learning is disclosed, studying autonomous intelligent control for carrier rockets. The invention mainly studies how to realize attitude control and path planning for vertical recovery of carrier rockets by using intelligent control. For the aerospace industry, the autonomous intellectualization of spacecraft is undoubtedly of great significance, whether in saving labor cost or in reducing human error. A carrier rocket vertical recovery simulation model is established, and a corresponding Markov decision process, including a state space, an action space, a state transition equation and a return function, is defined. The mapping relationship between the environment and agent behavior is fitted by a neural network, and the neural network is trained so that a carrier rocket can be recovered autonomously and controllably using the trained network. The project can not only provide technical support for intelligent spacecraft orbit planning technology, but also provide a simulation and verification platform for deep-reinforcement-learning-based attack-defense confrontation between spacecraft.
Owner:BEIJING AEROSPACE AUTOMATIC CONTROL RES INST +1
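
One way the return function of such a Markov decision process might look is sketched below. The state layout, penalty weights and terminal bonuses are hypothetical choices for illustration; the abstract does not specify them.

```python
import numpy as np

# Assumed state s = [height, vertical_velocity, pitch_angle, pitch_rate, fuel_mass]
# and action a = (throttle in [0, 1], gimbal_angle in radians).

def return_function(state, action, landed, crashed):
    """Shaped per-step return: penalise velocity, attitude error and fuel use,
    with terminal bonuses/penalties for a soft landing or a crash."""
    h, v, theta, omega, fuel = state
    throttle, gimbal = action
    r = -0.1 * abs(v) - 1.0 * abs(theta) - 0.5 * abs(omega) - 0.05 * throttle
    if landed:
        r += 100.0
    if crashed:
        r -= 100.0
    return r

s = np.array([50.0, -10.0, 0.05, 0.01, 300.0])
print(return_function(s, action=(0.6, 0.0), landed=False, crashed=False))   # ≈ -1.085
```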

Device, system and method for repairing underground pipeline of trenchless pneumatic cracking pipe

The embodiment of the invention provides a device, a system and a method for repairing an underground pipeline by trenchless pneumatic pipe cracking. The system comprises an existing well and a working pit. The existing well is provided with a pulley group and a winch; the winch is connected with a pipe-cracking head through a traction device; the traction device is provided with front and back steering connectors, a pin, an adjusting device and a bolt; and an air compressor is arranged outside the working pit to drive a pipe ramming machine. The pipe-cracking head is provided with a tapered head and a tail part for connecting with a replacement pipe, and has a through hole whose center line coincides with the center line of the tapered head; a front columnar through hole is used for mounting the front steering connector, and a tail tapered through hole is used for connecting the pipe ramming machine. An expansion cutter is arranged on the outer surface of the tapered head, and the outer diameter of the pipe-cracking head formed by the expansion cutter is greater than the inner diameter of the waste pipe, so that the expanded and cracked waste pipe is pushed outward. The whole process ingeniously utilizes the returning function of the pipe ramming machine under a one-pit-to-one-well arrangement, with the existing well taking the place of a receiving pit, so the amount of excavation is reduced, working efficiency is improved and cost is saved.
Owner:北京隆科兴科技集团股份有限公司

Frequency spectrum auction method of two-layer heterogeneous network containing small cells

The invention provides a frequency spectrum auction method for a two-layer heterogeneous network containing small cells. Considering a macro cellular network and the numerous small cells within its radiation range, a frequency spectrum allocation auction model is established in which the authorized frequency band provided by the macro base station is divided into a plurality of time slots to be auctioned as commodities; a user return function is set, an auction objective function is established according to the auction bids, and a maximum-return slot allocation method is derived from the bids through optimization theory. The frequency spectrum allocation auction model is an auction technique based on the VCG mechanism that performs allocation by dividing the time slots. The method has the following beneficial technical effects: (1) the auction mode is used for resource allocation; (2) the frequency spectrum resource is divided into time slots; and (3) the user bids are associated with the gains the small cells obtain by purchasing the frequency band for communication. The method achieves effective allocation of mobile communication resources, provides a sound spectrum resource allocation mechanism and improves the utilization efficiency of spectrum resources, so it has broad application prospects.
Owner:GUILIN UNIV OF ELECTRONIC TECH
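
Because the abstract describes a VCG-mechanism auction over time slots, a toy sketch helps make the mechanics concrete. When per-slot bids are additive, the VCG outcome reduces to a per-slot second-price auction, which is what the code below implements; the bid values and slot count are invented for illustration, and the patent's actual objective function may differ.

```python
def vcg_slot_auction(bids):
    """bids[i][t] = bid of small cell i for time slot t (additive valuations).
    Returns (allocation, payments): slot -> winning cell, cell -> VCG payment."""
    n_cells, n_slots = len(bids), len(bids[0])
    allocation = {}
    payments = {i: 0.0 for i in range(n_cells)}
    for t in range(n_slots):
        ranked = sorted(range(n_cells), key=lambda i: bids[i][t], reverse=True)
        winner, runner_up = ranked[0], ranked[1]
        allocation[t] = winner
        # VCG payment = welfare the other cells lose because the winner takes the
        # slot, which here is simply the second-highest bid for that slot.
        payments[winner] += bids[runner_up][t]
    return allocation, payments

bids = [[3.0, 1.0, 0.5],   # small cell 0's bids for slots 0..2
        [2.0, 4.0, 0.4],   # small cell 1
        [1.0, 2.5, 0.6]]   # small cell 2
print(vcg_slot_auction(bids))
# ({0: 0, 1: 1, 2: 2}, {0: 2.0, 1: 2.5, 2: 0.5})
```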