The invention discloses a multi-virtual power plant decentralized self-discipline optimization method. The method comprises the following steps: S1, initializing parameters; S2, classifying tasks and forming an initial knowledge matrix; S3, acquiring information; S4, determining the action of each optimization individual; S5, calculating the objective function value of each agent; S6, calculating a reward function; S7, updating the knowledge matrix; S8, information feedback, in which each agent returns its current optimal solution to an information center; and S9, judging whether the maximum number of iterations has been reached and, if so, outputting the optimal knowledge matrix of the corresponding task, otherwise returning to S3. The method solves the technical problems that existing distribution network regulation and control can neither meet the requirement that a plurality of virtual power plants participate in the power market in real time for profit, nor effectively control the grid connection behavior of distributed equipment to support the safe and effective operation of the distribution network.
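
The loop formed by steps S1 to S9 can be illustrated with a minimal sketch. The abstract does not give the update rule, the objective function, or the reward function, so everything concrete below is an assumption made for illustration: the knowledge matrix is updated with a standard Q-learning rule, actions are chosen epsilon-greedily, the state is taken to be the agent's previous action, and `toy_cost`, `run_sketch`, `targets`, `demand`, and `penalty` are hypothetical names and values, not part of the disclosed method.

```python
import random


def toy_cost(i, action, shared, targets, demand=4, penalty=0.5):
    """Hypothetical objective for agent i: distance from its own target
    plus a penalty on the mismatch between total output and demand."""
    others = sum(a for j, a in enumerate(shared) if j != i)
    return (action - targets[i]) ** 2 + penalty * abs(action + others - demand)


def run_sketch(targets=(1, 3, 0), n_actions=5, max_iter=200,
               alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    # S1: initialise parameters (the arguments above are assumed values).
    rng = random.Random(seed)
    n_agents = len(targets)
    # S2: one zero-initialised knowledge matrix per agent for a single task;
    # rows are states (assumed: the previous action), columns are actions.
    knowledge = [[[0.0] * n_actions for _ in range(n_actions)]
                 for _ in range(n_agents)]
    state = [0] * n_agents
    # Information center: best (cost, action) reported so far by each agent.
    center = [(float("inf"), 0) for _ in range(n_agents)]

    for _ in range(max_iter):                  # S9: stop at max iterations
        shared = [a for _, a in center]        # S3: acquire information
        for i in range(n_agents):
            row = knowledge[i][state[i]]
            if rng.random() < epsilon:         # S4: choose an action
                action = rng.randrange(n_actions)
            else:
                action = row.index(max(row))
            cost = toy_cost(i, action, shared, targets)   # S5: objective
            reward = -cost                     # S6: reward (assumed: -cost)
            # S7: Q-learning style update of the knowledge matrix.
            best_next = max(knowledge[i][action])
            row[action] += alpha * (reward + gamma * best_next - row[action])
            state[i] = action
            if cost < center[i][0]:            # S8: feed back best solution
                center[i] = (cost, action)
    return knowledge, center


if __name__ == "__main__":
    knowledge, center = run_sketch()
    print(center)
```

Each agent keeps its own knowledge matrix and exchanges only its best solution through the information center, which is one way to read the "decentralized" character of the method; a real implementation would replace `toy_cost` with the market and network model of each virtual power plant.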