Method and apparatus for anonymizing re-identification data in a visual tracking system

By processing re-identification data in a visual tracking system with an anonymization layer and tracking permission tokens, the problems of data dissemination and scope definition are addressed, and visual tracking that is both anonymized and limited in scope is achieved.

CN120182332B · Active · Publication Date: 2026-06-19 · AXIS

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
AXIS
Filing Date
2024-12-13
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In visual tracking systems, existing technologies raise concerns about the spread of re-identification data, leading to personal data leaks and unauthorized tracking, and making it difficult to effectively define the scope of visual tracking in terms of space and time.

Method used

An anonymization layer processes the re-identification data with a one-way function modified by a tracking permission token, generating anonymized data specific to a group of image sources. The tracking client then uses this data for visual tracking, which ensures data anonymity and limits the tracking scope.

Benefits of technology

It enables the anonymization of re-identification data in visual tracking systems, preventing access to feature vectors and unauthorized tracking, ensuring that data is used only within the FOV of a specified image source, protecting privacy and limiting the tracking range.

✦ Generated by Eureka AI based on patent content.


Abstract

A method and apparatus for anonymizing re-identification data in a visual tracking system are disclosed. The method includes: detecting (210) sub-regions containing a tracking target in images obtained from multiple image sources; calculating (211) a feature vector representing the visual appearance of the tracking target in each sub-region; providing (212.1 / 212.2) a first re-identification data item / second re-identification data item by anonymizing each feature vector using a predefined one-way function modified by a first tracking permission token / second tracking permission token for a first subgroup / second subgroup of image sources, and exposing (213.1 / 213.2) the first re-identification data item / second re-identification data item marked with the location of the corresponding sub-region to a first tracking client / second tracking client, wherein the second tracking permission token is different from the first tracking permission token; and preventing (214) access to the feature vector.

Description

Technical Field

[0001] This disclosure relates to the field of visual tracking systems. In particular, this disclosure proposes a novel method and apparatus for anonymizing re-identification data in a visual tracking system, and a novel method and apparatus for visual tracking using anonymized re-identification data.

Background Technology

[0002] Recent technological advancements in object re-identification have provided more powerful methods for visually tracking targets such as people or objects within the field of view (FOV) of a single camera or across the FOVs of several cameras. This is extremely useful in applications such as object search and loitering detection. Following an established and widely practiced paradigm, visual tracking is not performed directly on image data, but rather on so-called re-identification data (reID data). As reID data, feature vectors, each representing the visual appearance of the target being tracked, are typically used. These feature vectors can be described as a low-dimensional representation of the image data, which saves on the amount of data to be processed and allows for precise matching.

[0003] Visual tracking can be performed by a tracking client separate from the camera that acquires the raw image data. In this case, image data must be transmitted from the camera to the tracking client via a data connection, raising concerns about the unwanted dissemination of personal data (e.g., through eavesdropping). If visual tracking is to be performed across the field of view (FOV) of several cameras, the transmission of image data is often unavoidable. The same concerns about dissemination are valid if the camera supplies reID data to the tracking client instead of image data, as reID data could enable unauthorized tracking of previously identified targets.

[0004] Given these concerns, it is desirable to preprocess or modify re-identification data at the source to define the scope of visual tracking relative to space and / or time. As an example of spatial delimitation, a system owner might want to specify that re-identification data from an image source is only used for visual tracking within the FOV of that image source.

[0005] In another example, the system owner may agree to combine this re-identification data with re-identification data from a second image source, but not with re-identification data from a third image source; in other words, the permitted use of re-identification data spans the FOVs of both the first and second image sources. In the event that an unauthorized party (eavesdropper) gains access to the re-identification data, the technical mechanism should cause its tracking attempts to fail.

Summary of the Invention

[0006] One object of this disclosure is to provide a method and apparatus for defining the scope of visual tracking relative to space and / or time. (Here, if the scope is limited to multiple specified image sources, the scope can be said to be defined relative to space even if the spatial location of the image sources is unknown.) A further object is to provide a method and apparatus for anonymizing the data on which visual tracking is based. A further object is to provide re-identification data that supports re-identification only in the FOV of a specific group of image sources. A further object is to provide a method and apparatus for protecting anonymity. Still a further object is to provide a method and apparatus for performing visual tracking using anonymized data.

[0007] At least some of these objectives are achieved by means of the invention as defined in the independent claims. The dependent claims relate to advantageous embodiments of the invention.

[0008] In a first aspect of this disclosure, a visual tracking system having the following main components is provided:

[0009] Multiple image sources,

[0010] Anonymization layer, and

[0011] Multiple tracking clients.

[0012] The image sources are configured to provide images of their respective fields of view (FOV). Each image source is configured to detect respective sub-regions $m_i$ containing a tracked target in the images obtained from the image source and, for each sub-region, to calculate a feature vector $f(m_i)$ representing the visual appearance of the tracked target within it. The anonymization layer is implemented in processing circuitry separate from the tracking clients, such as processing circuitry co-located with the corresponding image source and / or processing circuitry belonging to a coordinating entity within the visual tracking system. The anonymization layer is configured to provide first re-identification data items $g_{i,1} = h([f(m_i), \sigma_1])$ by anonymizing each feature vector using a predefined one-way function h modified by the first tracking permission token σ1. The first tracking permission token σ1 is specific to the first group of image sources. The anonymization layer is further configured to provide second re-identification data items $g_{i,2} = h([f(m_i), \sigma_2])$ by anonymizing each feature vector using the predefined one-way function h, now modified by the second tracking permission token σ2. The second tracking permission token σ2 is specific to the second group of image sources and is different from the first tracking permission token; σ2 ≠ σ1. The anonymization layer is further configured to expose, to the tracking clients, the re-identification data items labeled with the position $X(m_i)$ of the corresponding sub-region $m_i$. The first group of image sources includes at least one of the image sources, and so does the second group. Finally, the tracking clients are configured to perform re-identification of the tracked target using data obtained from the anonymization layer. The tracking clients include at least a first tracking client, authorized to perform re-identification in the FOVs of the first group of image sources and to acquire the first re-identification data items with location annotations, and a second tracking client, authorized to perform re-identification in the FOVs of the second group of image sources and to acquire the second re-identification data items with location annotations. A tracking client can acquire the re-identification data items by receiving them over an internal or external data connection, or by retrieving them from shared memory.

[0013] Advantageously, because the anonymization layer prevents access to the feature vectors (and the anonymization layer is separate from the tracking clients), and because the first and second re-identification data items are generated using different tracking permission tokens σ1, σ2 specific to the first and second group of image sources respectively, each tracking client is restricted to performing re-identification within the FOVs of its corresponding group of image sources. Suppose the first tracking client attempts to perform re-identification in a set containing both first and second re-identification data items. The first tracking client will then never find a match between a first and a second re-identification data item. Because the one-way function is collision-free, the distinctness of the tracking permission tokens σ1 ≠ σ2 ensures that the outputs differ even when the function h is applied to the same feature vector f(m0):

[0014] $$g_{0,1} = h([f(m_0), \sigma_1]) \neq h([f(m_0), \sigma_2]) = g_{0,2} \tag{1}$$

[0015] Therefore, it is technically futile for the first tracking client to attempt to go beyond the FOVs of the first group of image sources. With conventional re-identification data (e.g., if the original feature vectors were used as re-identification data), there is no inherent mechanism preventing the first tracking client from performing re-identification outside the FOVs of the first group of image sources.
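
As a rough illustration of relation (1), the sketch below models the one-way function h as HMAC-SHA-256 keyed with the tracking permission token. This choice of function, the helper name `anonymize`, the token values, and the byte serialization of the feature vector are all assumptions made for illustration; the disclosure only requires a predefined one-way function modified by the token.

```python
import hashlib
import hmac
from typing import Sequence


def anonymize(feature_vector: Sequence[int], token: bytes) -> str:
    """Hypothetical stand-in for h([f(m), sigma]): a keyed one-way hash."""
    # Serialize the integer-valued feature vector deterministically, so that
    # equal vectors always produce equal byte strings.
    payload = ",".join(str(component) for component in feature_vector).encode("utf-8")
    return hmac.new(token, payload, hashlib.sha256).hexdigest()


f_m0 = (9, 8, 7, 6, 5, 4, 3, 2, 1, 0)  # arbitrary example feature vector
sigma1 = b"token-for-first-group"      # illustrative token values (assumed)
sigma2 = b"token-for-second-group"

g_0_1 = anonymize(f_m0, sigma1)
g_0_2 = anonymize(f_m0, sigma2)

# Same feature vector, different tokens: the items do not match.
print(g_0_1 == g_0_2)
```

A cryptographic hash only matches on exact equality of its input, so this sketch presupposes feature vectors that are quantized in such a way that the same target yields identical vectors.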

[0016] Furthermore, due to the one-way nature of the function h, the feature vectors $f(m_i)$ are anonymized. A party downstream of the anonymization layer cannot reconstruct the feature vector $f(m_i)$ from one of the re-identification data items $g_{i,1}$, $g_{i,2}$. Even with full access to a large set of re-identification data items, inverting the function h is computationally infeasible.

[0017] In a second aspect of this disclosure, a method is provided for providing anonymized data to facilitate re-identification in a visual tracking system. The method includes: detecting, in images obtained from multiple image sources, respective sub-regions $m_i$ containing a tracked target; and, for each sub-region, calculating a feature vector $f(m_i)$ representing the visual appearance of the tracked target within it. The method further includes, for a first subgroup of the image sources: providing first re-identification data items $g_{i,1} = h([f(m_i), \sigma_1])$ by anonymizing each feature vector using a predefined one-way function h modified by a first tracking permission token σ1; and exposing the first re-identification data items, labeled with the location $X(m_i)$ of the corresponding sub-region, to a first tracking client. The method further includes, for a second subgroup of the image sources: providing second re-identification data items $g_{i,2} = h([f(m_i), \sigma_2])$ by anonymizing each feature vector using the predefined one-way function h modified by a second tracking permission token σ2 different from the first tracking permission token; and exposing the second re-identification data items, labeled with the location $X(m_i)$ of the corresponding sub-region, to a second tracking client. The method further includes preventing access to the feature vectors.

[0018] The method according to the second aspect facilitates (i.e., assists, supports, enables) re-identification in a visual tracking system because it provides re-identification data items among which a tracking client can search for matches in order to track a target. As explained above, the proposed way of generating re-identification data items also ensures anonymity. Furthermore, because the tracking permission tokens are distinct, the re-identification data items are generated in a way that confines the scope of visual tracking to a specified subgroup of the image sources. Multiple instances of the method according to the second aspect can be executed such that the first tracking permission token σ1 is specific not only to the first subgroup of image sources but to a first group of image sources that includes the first subgroup, and / or the second tracking permission token σ2 is specific not only to the second subgroup but to a second group of image sources that includes the second subgroup. For example, the first group of image sources may consist of the first subgroup and a further subgroup, with an independent process using the same first tracking permission token σ1 to generate re-identification data items for that further subgroup.

[0019] The method according to the second aspect involves steps that can be performed in the anonymization layer as well as steps that can be performed in the image sources. In implementations of the method, the steps of detecting the sub-regions $m_i$ and calculating the feature vector $f(m_i)$ for each sub-region can be delegated to the image sources. Accordingly, the anonymization layer need not perform more than the following steps:

[0020] - For the first subgroup of image sources: providing first re-identification data items by anonymizing each feature vector using a predefined one-way function h modified by the first tracking permission token; and exposing the first re-identification data items, labeled with the location of the corresponding sub-region, to the first tracking client.

[0021] - For the second subgroup of image sources: providing second re-identification data items by anonymizing each feature vector using the function h modified by the second tracking permission token; and exposing the second re-identification data items, labeled with the location of the corresponding sub-region, to the second tracking client.

[0022] - Preventing access to the feature vectors.

[0023] It should be understood that the anonymization layer may have initially obtained feature vectors computed for sub-regions in images from multiple image sources, where each sub-region contains the tracking target, and the feature vectors represent the visual appearance of the tracking target within them.
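
The three anonymization-layer steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class and attribute names (`AnonymizationLayer`, `ReIdItem`, `ingest`, `expose`) are hypothetical, HMAC-SHA-256 stands in for the one-way function h, and each image source is assumed to belong to exactly one subgroup.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class ReIdItem:
    """Anonymized re-identification data item labeled with a sub-region position."""
    g: str
    position: tuple


class AnonymizationLayer:
    """Hypothetical anonymization layer; all names here are illustrative."""

    def __init__(self, tokens_by_subgroup, subgroup_of_source):
        self._tokens = tokens_by_subgroup       # subgroup id -> token (bytes)
        self._subgroup_of = subgroup_of_source  # image source id -> subgroup id
        self._outbox = {subgroup: [] for subgroup in tokens_by_subgroup}

    def ingest(self, source_id, feature_vector, position):
        """Anonymize one feature vector with the token of its source's subgroup."""
        subgroup = self._subgroup_of[source_id]
        payload = repr(tuple(feature_vector)).encode("utf-8")
        g = hmac.new(self._tokens[subgroup], payload, hashlib.sha256).hexdigest()
        # Only the anonymized item and its position are kept; the feature
        # vector itself is never stored, which models "preventing access".
        self._outbox[subgroup].append(ReIdItem(g, position))

    def expose(self, subgroup):
        """Return the labeled items intended for the client of this subgroup."""
        return list(self._outbox[subgroup])


layer = AnonymizationLayer(
    tokens_by_subgroup={"group1": b"sigma-1", "group2": b"sigma-2"},
    subgroup_of_source={"cam1": "group1", "cam2": "group1", "cam3": "group2"},
)
same_target = (9, 8, 7)
layer.ingest("cam1", same_target, ("cam1", 10, 20))
layer.ingest("cam2", same_target, ("cam2", 30, 40))
layer.ingest("cam3", same_target, ("cam3", 50, 60))

first_items = layer.expose("group1")
second_items = layer.expose("group2")
```

In this sketch, the two items from `cam1` and `cam2` carry the same value `g` and can be matched by the first tracking client, while the item from `cam3` carries a different value even though it derives from the same feature vector.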

[0024] According to a third aspect of this disclosure, a method is provided for visually tracking a target in the fields of view of a first group of image sources. The method includes: receiving first re-identification data items $g_{i,1}$; searching for matching re-identification data items among the first re-identification data items; and, for a set of mutually matching re-identification data items, tracking a target based on the positions with which the mutually matching re-identification data items are labeled. According to the third aspect, each first re-identification data item originates from a feature vector $f(m_i)$ that represents the visual appearance of a tracked target in a sub-region of an image obtained from an image source in the first group, and each first re-identification data item is labeled with the location of the corresponding sub-region. Furthermore, all first re-identification data items are computed using a predefined one-way function h modified by a first tracking permission token σ1 specific to the first group of image sources.

[0025] Although the re-identification data is provided in the form of anonymized data, the method according to the third aspect still enables visual tracking. In particular, the method may include searching for matching re-identification data items among a set of re-identification data items that have been computed using a single predefined one-way function h modified by a single tracking permission token σ1. (As explained above, the inputs to this computation are feature vectors, each representing the visual appearance of a detected tracked target.) That is, the method excludes searching for matching re-identification data items among re-identification data items computed using different one-way functions and / or among re-identification data items computed using one-way functions modified by two or more different tracking permission tokens.
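
A tracking client following the third aspect could be sketched as below. The function name `track`, the tuple layout, and the timestamp attached to each item are assumptions made for illustration; matching is modeled as exact equality of the anonymized values.

```python
from collections import defaultdict


def track(items):
    """Group anonymized items by value and return the position history per target.

    `items` is an iterable of (g, position, timestamp) triples, where g is an
    anonymized re-identification data item. The timestamp is an assumption of
    this sketch, added so that sightings can be ordered in time.
    """
    sightings = defaultdict(list)
    for g, position, timestamp in items:
        sightings[g].append((timestamp, position))
    # Keep only sets of mutually matching items (two or more sightings) and
    # order each set by time.
    return {
        g: sorted(history, key=lambda sighting: sighting[0])
        for g, history in sightings.items()
        if len(history) > 1
    }


received = [
    ("a1f3", ("cam1", 10, 20), 3.0),
    ("77c0", ("cam1", 55, 60), 1.5),
    ("a1f3", ("cam2", 30, 40), 1.0),
]
tracks = track(received)
```

Here the two items with value `"a1f3"` form one track spanning `cam2` and then `cam1`, while the unmatched item is ignored.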

[0026] In the fourth and fifth aspects of this disclosure, an apparatus or cluster of apparatus is provided that includes processing circuitry configured to perform the method according to the second aspect (providing anonymized data to facilitate re-identification) or the method according to the third aspect (performing visual tracking).

[0027] This disclosure further proposes a computer program containing instructions for causing a computer to perform the methods described above. The computer program may be stored or distributed on a data carrier. As used herein, "data carrier" may be a transient data carrier (such as modulated electromagnetic waves or light waves) or a non-transient data carrier. Non-transient data carriers include volatile and non-volatile memories such as permanent and non-permanent storage media of the magnetic, optical, or solid-state types. Still within the scope of "data carrier," such memory may be permanently mounted or portable.

[0028] Some implementations also define the scope of visual tracking relative to time: once the first tracking permission token σ1 has expired, it is no longer used (in the anonymization layer). Thereafter, the anonymization layer anonymizes the feature vectors using the predefined one-way function h modified by a replacement first tracking permission token σ′1. Because the one-way function provides collision-free output values, the first tracking client will never find a match between re-identification data items generated before and after the token replacement, even for the same feature vector (i.e., similarly to (1), h([f(m0),σ1]) ≠ h([f(m0),σ′1])).

[0029] To define the scope of visual tracking relative to time only, the same tracking permission token is used to modify the one-way function h for all image sources in the visual tracking system. When the tracking permission token expires, the one-way function h is instead modified by a replacement tracking permission token (still for all image sources in the visual tracking system). This allows the tracking clients to perform re-identification within the FOVs of all image sources in the visual tracking system, but only within one validity period at a time.
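
The temporal delimitation can be illustrated with a token that is replaced once its validity period has elapsed. The class `RotatingToken`, the one-hour validity period, and the use of HMAC-SHA-256 for h are assumptions of this sketch; the disclosure does not prescribe how tokens are generated or when they expire.

```python
import hashlib
import hmac
import secrets


class RotatingToken:
    """Hypothetical tracking permission token replaced after a validity period."""

    def __init__(self, validity_seconds: float, now: float):
        self._validity = validity_seconds
        self._issued_at = now
        self._token = secrets.token_bytes(32)

    def current(self, now: float) -> bytes:
        # Once the token has expired, discard it and issue a replacement.
        if now - self._issued_at >= self._validity:
            self._token = secrets.token_bytes(32)
            self._issued_at = now
        return self._token


def anonymize(feature_vector, token: bytes) -> str:
    """Stand-in for h([f(m), sigma]) using a keyed hash (assumed choice)."""
    payload = repr(tuple(feature_vector)).encode("utf-8")
    return hmac.new(token, payload, hashlib.sha256).hexdigest()


sigma = RotatingToken(validity_seconds=3600.0, now=0.0)
f_m0 = (1, 2, 3)

before = anonymize(f_m0, sigma.current(now=1800.0))
same_period = anonymize(f_m0, sigma.current(now=3599.0))
after = anonymize(f_m0, sigma.current(now=3600.0))
```

Items generated within the same validity period match each other, whereas items generated before and after the replacement do not, even for the same feature vector.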

[0030] For the purposes of this disclosure, the term "re-identification data" (or reID data) refers to the quantity or variable that forms the basis of a re-identification process, which is a matching process that identifies image data from different times and locations as pointing to the same tracking target (such as a person or object). ReID data can be a proxy for actual image data (such as feature vectors derived from image data). In an individualized form, the feature vector can be referred to as a reID data item. ReID data can be provided in the form of non-anonymized data (e.g., feature vectors) or anonymized data (e.g., hash values of feature vectors). According to the first, second, and further aspects of this disclosure, reID data should be provided as anonymized data. Hashing and other types of anonymization processes may result in changes to the data type; for example, the hash value of a vector can be a scalar, although there are special hashing techniques that return vectors.

[0031] As used herein, the "position" $X(m_i)$ of a sub-region $m_i$ can refer to the position of the sub-region within an image or to its geographic location. The geographic location may correspond to the location of the image source from which the image was acquired, independent of the sub-region's position within the image. Alternatively, the geographic location can be the approximate location of the portion of the image source's FOV that the sub-region $m_i$ depicts.

[0032] Generally, all terms used in the claims should be interpreted according to their ordinary meaning in the technical field, unless otherwise expressly defined herein. Unless otherwise expressly stated, all references to “a”, “an”, “the” element, device, component, method, step, etc., should be interpreted openly as referring to at least one instance of an element, device, component, method, step, etc. Unless expressly stated otherwise, the steps of any method described herein need not be performed in the precise order disclosed.

Description of the Drawings

[0033] Aspects and implementation methods will now be described by way of example with reference to the accompanying drawings, wherein:

[0034] Figures 1A and 1B show two visual tracking systems;

[0035] Figure 2 is a flowchart of a method for providing anonymized data to facilitate re-identification in a visual tracking system;

[0036] Figure 3 is a flowchart of a method for visual tracking of targets based on anonymized re-identification data; and

[0037] Figure 4 illustrates how anonymized re-identification data is generated from acquired images.

Detailed Implementation

[0038] Aspects of this disclosure will now be described more fully below with reference to the accompanying drawings, in which certain embodiments of the invention are illustrated. However, these aspects may be embodied in many different forms and should not be construed as limiting; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of all aspects of the invention to those skilled in the art. Throughout the specification, the same numerals refer to the same elements.

[0039] Figure 1A shows a visual tracking system 100 suitable for visually tracking targets P1, P2, P3. The tracking targets P1, P2, P3 can be humans or inanimate objects. The example in Figure 1A illustrates a vehicle P1 and two distinct individuals P2 and P3. Conceptually, visual tracking technology can be said to be based on the assumption that a tracked target can be identified with high precision based on visual features that are unique or substantially unique within the tracking range. In the case of a person, unique visual features may include height, posture, body shape, facial features, and combinations of clothing and footwear. In the case of a vehicle, the license plate may be the preferred feature. For each tracked target, a possible output of visual tracking could be a list of locations and an indication of when the tracked target appeared in those locations.

[0040] The visual tracking system 100 of Figure 1A includes four image sources 120.1, 120.2, 120.3, 120.4 arranged to provide images of corresponding FOVs 129.1, 129.2, 129.3, 129.4 and at least two tracking clients 140.1, 140.2. Each tracking client 140 includes processing circuitry 141 and memory 142, wherein the memory 142 is adapted to store a computer program 143 executable by the processing circuitry 141, as well as input and output data of the re-identification process, among other things. The visual tracking system 100 further includes an anonymization layer 130; the anonymization layer 130 is a virtual entity implemented in a coordinating entity 150. The coordinating entity 150 may be a centralized hardware component or a centralized software process in the visual tracking system 100. The coordinating entity 150 may be implemented in a dedicated server or host in a computer network, or implemented as a process executed on a computer or virtual machine with additional responsibilities in the visual tracking system 100. The coordinating entity 150 executes on processing circuitry 151 and has a memory 152 at its disposal; the functions of the coordinating entity 150 can be encoded in a computer program 153.

[0041] The image sources 120, the coordinating entity 150, and the tracking clients 140 are linked via wired or wireless data connections (e.g., by connection to a public data network). As will be explained below, the data connections from the image sources 120 to the coordinating entity 150 should preferably be protected against attacks because they carry feature vectors that have not yet been anonymized. (These data connections are also sensitive in those embodiments of the visual tracking system 100 where the feature vectors are computed in the coordinating entity 150, because the data connections then carry image data.) The data connections can be protected from eavesdropping and other attacks by appropriate end-to-end encryption, by tunneling, or by routing the data connections only over physically protected, reasonably intrusion-resistant wired lines (such as an internal device bus).

[0042] In the visual tracking system 100 of Figure 1A, the novel technique proposed herein is used to spatially define the scope of visual tracking, and more specifically, to allow a first tracking client 140.1 to perform visual tracking within the FOVs of a first group 110.1 of image sources including a first image source 120.1 and a second image source 120.2, and to allow a second tracking client 140.2 to perform visual tracking within the FOVs of a second group 110.2 of image sources including a third image source 120.3 and a fourth image source 120.4. For this purpose, first re-identification data items $g_{1,1}, g_{2,1}, g_{3,1}, \ldots$ with location annotations are supplied to the first tracking client 140.1, and second re-identification data items $g_{1,2}, g_{2,2}, g_{3,2}, \ldots$ with location annotations are supplied to the second tracking client 140.2. The first group 110.1 of image sources and the second group 110.2 of image sources may be disjoint, or they may at least partially overlap.

[0043] Without departing from the scope of this disclosure, re-identification data originating from the same group of image sources can be provided to more than one tracking client. For example, as indicated by the dashed box in the upper right corner of Figure 1A, a third tracking client 140.3 and a fourth tracking client 140.4 can perform re-identification in the FOVs of the second group 110.2 of image sources based on the second re-identification data items with location annotations.

[0044] The functions of the different components of the visual tracking system 100 will now be described in more detail. Image sources 120.1, 120.2, 120.3, and 120.4 include lenses, photosensitive components (image sensors), and image processing components, through which they provide digital image data representing still images or video sequences of their respective fields of view (FOV) 129.1, 129.2, 129.3, and 129.4. Image source 120 may be, for example, a digital video camera.

[0045] In the example shown in Figure 1A, each image source 120 further includes processing circuitry 121 configured to detect sub-regions m1, m2, ..., m6 in the images M such that each sub-region contains a tracking target. The upper part of Figure 4 shows an example of applying this detection process to four images M; visually, it can be observed that sub-regions m1, m4, m6 contain the same person at different locations (and thus at different times), sub-region m5 contains a different person, and sub-regions m2, m3 contain the same vehicle. Detection can be based on object detection algorithms or object classification algorithms, such as body part detection algorithms, face detection algorithms, or, in the case of vehicles, algorithms configured to detect alphanumeric characters (e.g., on license plates) or visual features of vehicles. In different implementations, the algorithm can be configured to output bounding boxes directly, or separate bounding-box post-processing can be performed downstream of the algorithm to obtain sub-regions m1, m2, ..., m6. In terms of notation, each sub-region $m_i$ preferably has a globally unique index i. The index can be a single index from a sequence shared by all image sources. Alternatively, it can be a combination of multiple sub-indexes (e.g., i = (i1, i2, i3)), where the sub-indexes refer to the sequence of image sources (i1), the sequence of images captured by each image source (i2), and / or the sequence of sub-regions within each image (i3).
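
The composite index i = (i1, i2, i3) can be modeled as a small tuple type. The type name `SubRegionIndex` and its field names are hypothetical; the sketch only shows that such a combination of sub-indexes is globally unique as long as each sub-index is unique within its own sequence.

```python
from typing import NamedTuple


class SubRegionIndex(NamedTuple):
    """Hypothetical composite index i = (i1, i2, i3) of a sub-region."""
    source: int  # i1: position in the sequence of image sources
    image: int   # i2: position in the sequence of images from that source
    region: int  # i3: position in the sequence of sub-regions in that image


# Three sub-regions that share some, but never all, sub-indexes.
indexes = {
    SubRegionIndex(source=1, image=17, region=0),
    SubRegionIndex(source=2, image=17, region=0),
    SubRegionIndex(source=1, image=17, region=1),
}
```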

[0046] The processing circuitry 121 of the image source 120 in Figure 1A is further configured to calculate, for each sub-region m1, m2, ..., m6, a corresponding feature vector (or appearance vector) denoted f(m1), f(m2), ..., f(m6), representing the visual appearance of the one or more tracked targets in these sub-regions. A feature vector may, for example, be a numeric vector with known or unknown meaning in the feature space, such as f(m1) = (9, 8, 7, 6, 5, 4, 3, 2, 1, 0). Alternatively, the feature vector may be represented as a string, such as the following string of 58 characters in the base64 alphabet:

[0047] aAPjFcw7bOw74uD4GEVdpo5v0-Sal1eoguVkYuwc1srRbb0OFhLOVOwUPA,

[0048] which represents 43 bytes of 8 bits each. In existing technologies, feature vectors, along with an appropriate distance metric, form the basis for matching during the re-identification process; that is, for evaluating whether two feature vectors are close enough relative to the distance metric that they should be considered as being associated with the same tracking target. Various person re-identification methods within this framework are reviewed in the following references:

[0049] • M. Ye et al., Deep Learning for Person Re-identification: Survey and Prospects, arXiv preprint, arXiv:2001.04193(2021);

[0050] • Zahra et al., “Person Re-identification: A Review of Domain-Specific Open Challenges and Future Trends,” Pattern Recognition, Vol. 142 (2023), 109669, DOI: 10.1016 / j.patcog.2023.109669; and

[0051] • L. Zheng et al., Person Re-identification: Past, Present and Future, arXiv preprint, arXiv:1610.02984 (2016).

[0052] The following explains how these results can be used as the basis for re-identification in the specific visual tracking system 100 described herein.

[0053] In a broad sense, a feature vector can be described as a low-dimensional representation of the visual appearance of a target being tracked. A feature vector can be, for example, represented as a license plate number consisting of alphanumeric characters read from a vehicle's license plate image (e.g., f(m1) = 'ABC123'). In this example, according to existing re-identification techniques, matching might require all characters to be equal; otherwise, the vehicle would not be identified as identical.

[0054] In another example, feature vectors are computed according to transparent ("handcrafted," human-defined) definitions, such as color- or texture-based characteristics (e.g., weighted color histograms, maximally stable color regions, recurrent highly structured patches) or attribute-based characteristics (e.g., clothing, biometrics). This definition of feature vectors does not depend on the target carrying an artificial label such as a license plate. The computation of feature vectors can further take into account aspects such as appearance from multiple viewpoints, or motion patterns of the target that can be derived from video data. Feature vectors can be low-dimensional in the sense that they consist of a small number of components and/or that each component takes values from a set of finite cardinality (e.g., by rounding real numbers to integers). In this case, according to prior-art re-identification techniques, a match between two feature vectors x = (x_1, x_2, ..., x_N), y = (y_1, y_2, ..., y_N) can correspond to complete equality of the vector components:

[0055] x_n = y_n, 1 ≤ n ≤ N, (2)

[0056] Alternatively, a match can correspond to a separation, measured by a distance metric d, that is below a threshold D0:

[0057] d(x, y) < D0. (3)

[0058] The distance metric in (3) can, for example, be the ℓ_p distance:

[0059] d(x, y) = (Σ_{n=1}^{N} |x_n − y_n|^p)^{1/p}

[0060] Here, p > 0 is a predefined constant. Alternatively, feature vectors can be compared relative to a distance metric d(x, y) computed by a machine learning model trained to mimic correct or human-like re-identification decisions; in that case, no closed-form definition of the distance metric is required.
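
The two matching criteria, exact equality per (2) and separation below a threshold per (3), can be sketched as follows. This is a minimal illustration only: the function names `match_exact`, `lp_distance`, and `match_within`, as well as the example values, are assumptions introduced here and do not appear in the disclosure.

```python
from typing import Sequence


def match_exact(x: Sequence, y: Sequence) -> bool:
    """Criterion (2): all components must be equal."""
    return len(x) == len(y) and all(a == b for a, b in zip(x, y))


def lp_distance(x: Sequence[float], y: Sequence[float], p: float = 2.0) -> float:
    """l_p distance between two equal-length feature vectors, for p > 0."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def match_within(x: Sequence[float], y: Sequence[float], d0: float, p: float = 2.0) -> bool:
    """Criterion (3): separation strictly below the threshold D0."""
    return lp_distance(x, y, p) < d0


# License-plate style feature: matching requires all characters to be equal.
plate_match = match_exact("ABC123", "ABC123")
plate_mismatch = match_exact("ABC123", "ABC124")

# Real-valued feature vectors compared with the Euclidean (p = 2) distance.
close = match_within([1.0, 2.0, 3.0], [1.0, 2.0, 3.5], d0=1.0)
far = match_within([1.0, 2.0, 3.0], [4.0, 6.0, 3.0], d0=1.0)
```

With the hypothetical threshold D0 = 1.0, the first pair of vectors (separation 0.5) is accepted as a match and the second pair (separation 5.0) is not.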

[0061] In a third example, the feature vectors are computed by a machine learning model, such as a convolutional neural network (CNN), trained to mimic correct or human-like re-identification decisions. Training can include supervised or unsupervised learning. When feature vectors are computed in this way, the computation does not follow any transparent definition (which the teachings herein do not require in any case), and its results are generally not explicitly interpretable (e.g., in terms of the appearance of the tracked object). Furthermore, the computation can be faithfully repeated only by the machine learning model itself. Depending on how the machine learning model is trained, feature vectors computed in this way can be compared relative to a closed-form distance metric (as in conditions (2) and (3)) or relative to a distance metric computed by the machine learning model.

[0062] In Figure 4, reference numeral 401 indicates the process of providing feature vectors f(m1), f(m2), ..., f(m6) based on image M. In the example visual tracking system 100 illustrated in Figure 1A, an instance of this process 401 runs in the respective image source 120 that acquires image M. In other embodiments, process 401 may run in the coordinating entity 150 of the visual tracking system 100.

[0063] A further process 402 determines the positions X(m1), X(m2), ..., X(m6) of the sub-regions m1, m2, ..., m6. Each position can refer to the location of sub-region m_i within image M (e.g., expressed in pixel coordinates or as a percentage of the FOV size), or it can be a geographic location (e.g., expressed in a global reference frame such as WGS84). For clarity, if the position is expressed in pixel coordinates, local geographic coordinates, or other local coordinates, the position should preferably include a direct or indirect identifier of the image source. The geographic location may correspond to the (fixed) location of the image source 120 from which the image is acquired, independent of the location of the sub-region within the image. Alternatively, the geographic location may be the approximate location of the portion of the image source's FOV that sub-region m_i depicts. Therefore, at least for some definitions of the position X(m_i), process 402 can be implemented in the image source 120 or in another component of the visual tracking system 100 (such as the coordinating entity 150), provided that the other component has access to the relevant information (such as the geographic location of the image source 120, or the geographic locations of different parts of the FOV of the image source 120, which may have been estimated in advance). The positions X(m1), X(m2), ..., X(m6) can be provided to the anonymization layer 130.

[0064] Anonymization layer

[0065] The function of the anonymization layer 130 will now be described with reference to the flowchart of method 200 in Figure 2. Initial steps 210 and 211 correspond to processes 401 and 402 in Figure 4, described above. As mentioned, they can be performed in the image source 120 (or co-located with it), in the coordinating entity 150, or even in different components of the visual tracking system 100. Initial steps 210 and 211 can be performed in or upstream of the anonymization layer 130. Therefore, in some embodiments, the anonymization layer 130 can be configured to perform only steps 212, 213, and 214. The architecture shown in Figure 1A, where the anonymization layer is centralized in the coordinating entity 150, avoids the need to distribute tracking permission tokens to multiple parallel processes and/or simplifies the replacement of tracking permission tokens upon expiration. It should be understood that when steps 210 and 211 are performed in the image source 120 (or co-located with the image source 120), they may correspond to multiple processes executed in parallel; for example, there may be one process per image source that detects the sub-regions m_i and calculates the feature vectors f(m_i).

[0066] Steps 212 and 213 are performed once for each subgroup 111 of the image sources, and can therefore be repeated for two or more subgroups 111. A different tracking permission token should be used for each execution of steps 212 and 213. In implementations where the anonymization layer 130 is not implemented in the processing circuitry 151 of the coordinating entity 150 but is distributed across multiple processors 121.1, 121.3, and 121.4 co-located with respective ones of the image sources 120.1, 120.3, and 120.4 (see Figure 1B), it is possible for two instances of method 200 (executed on two different processors) to use the same tracking permission token σ1 for two different subgroups 111.1 and 111.2 of the image sources. In this setting, the tracking permission token σ1 can be considered specific to the image source group 110.1, which is the union of the two subgroups 111.1 and 111.2.

[0067] For the first subgroup 111.1 of the image sources, step 212 includes providing (212.1) first re-identification data items by anonymizing each feature vector using a predefined one-way function h modified by a first tracking permission token σ1, as follows:

[0068] g_{i,1} = h([f(m_i), σ1]), i ∈ I1, (4)

[0069] where I1 is a first index set. The tracking permission token σ1 can be represented as a bit string. The notation [·,·] refers to a combining operation such as string concatenation. The quantity in (4) can take the form of a single number, a bit string, an alphanumeric string, a vector, etc.

[0070] The one-way function is assumed to be irreversible and to have no apparent collisions. A hash function can be used as the one-way function, in particular a cryptographic hash function considered to provide a sufficient level of security given the sensitivity of the image data. Examples of such one-way functions include SHA-256, SHA3-512, RSA-1024, and possibly MD5. When applied as shown in (4), the tracking permission token can be regarded as a cryptographic salt applied to the one-way function; conceptually, it defines a modified one-way function H(x) = h([x, σ1]). Because the one-way function h is irreversible even when the tracking permission token σ1 is known, it is not always necessary to keep σ1 secret. Furthermore, a recipient of the re-identification data items g_{1,1}, g_{2,1}, g_{3,1}, ... (such as the tracking client 140) can use them without knowing the value of the tracking permission token σ1.
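
By way of illustration only, the salted one-way function of (4) might be sketched as follows, assuming that h is SHA-256, that [·,·] is byte-string concatenation, and that the feature vector has already been serialized to bytes. The function name `anonymize` and the token values are hypothetical and chosen for this example.

```python
import hashlib


def anonymize(feature_vector: bytes, token: bytes) -> str:
    """Quantity (4): h([f, sigma]), here with h = SHA-256 and [.,.] = concatenation."""
    return hashlib.sha256(feature_vector + token).hexdigest()


# Hypothetical values: a license-plate feature and two tracking permission tokens.
feature = b"ABC123"
sigma_1 = b"token-for-group-1"
sigma_2 = b"token-for-group-2"

g_a = anonymize(feature, sigma_1)
g_b = anonymize(feature, sigma_1)  # same target observed again in the same group
g_c = anonymize(feature, sigma_2)  # same target observed in another group
```

Items produced with the same token match each other (`g_a == g_b`), whereas the same feature vector anonymized with a different token yields an unrelated item (`g_a != g_c`). Plain concatenation is ambiguous when both parts have variable length; a practical implementation would probably add a length prefix or delimiter, which is not shown here.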

[0071] As explained above, feature vectors are examples of re-identification data that can be used in a re-identification process. Several useful definitions of feature vectors are known in the art, and they can be classified according to whether a match requires equality per equation (2) or whether non-zero separations less than the threshold D0 are also accepted as matches per condition (3). The re-identification data items g_{i,1} are likewise a type of re-identification data that can be used in re-identification processes substantially in line with the state of the art, meaning that no drastic adaptations are required at the receiving end.

[0072] If the underlying feature vectors are defined such that a match requires equation (2), then the recipient can use the re-identification data items g_{1,1}, g_{2,1}, g_{3,1}, ... directly. This is because, thanks to the collision-free nature of h, the equality g_{j,1} = g_{j′,1} implies that the hidden underlying feature vectors are also equal, f(m_j) = f(m_j′).

[0073] If instead the underlying feature vectors are defined such that non-zero separations less than a threshold D0 are accepted as matches (condition (3)), then a proximity-preserving one-way function h (proximity-preserving hash function) should be used. A function h is proximity-preserving if, for all feature vectors x, y such that d(x, y) < D0, it holds that d(h(x), h(y)) < D′0, where D′0 is a constant. In words: for any pair of feature vectors closer than the threshold D0, there is a uniform bound D′0 on the distance between the anonymized feature vectors (i.e., the re-identification data items). Specifically, the proximity-preserving one-way function h should be such that, for some constant D′0, d(h([x, σ]), h([y, σ])) < D′0, where σ is the tracking permission token. (The distance function may have to be defined differently when applied to re-identification data items, because they have a different format or data type, but this is left implicit here so as not to burden the notation unnecessarily.) Using a proximity-preserving one-way function allows tracking clients further downstream in the processing chain to search for matches based on a modified version of proximity test (3), namely:

[0074] d(g_{j,1}, g_{j′,1}) < D′0. (3′)

[0075] The proximity-preserving property can be achieved by partitioning the feature vectors into shorter sub-vectors (each having one or more components) and hashing each sub-vector separately; in such an implementation, the re-identification data items include multiple sub-hashes of the sub-vectors (salted with the tracking permission token). If the feature vectors are partitioned into sub-vectors each having one component, and these are hashed separately, the resulting re-identification data item can be a vector of the same length as the feature vector. Further proximity-preserving hash functions are described in Y. Weiss et al., "Spectral Hashing," in Advances in Neural Information Processing Systems 21 (NIPS 2008), edited by D. Koller et al., ISBN 9781605609492.
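
The per-component variant described above might be sketched as follows. The sketch assumes integer-valued components, SHA-256 as the one-way function, and a distance that counts differing components (a Hamming-type distance); under that assumed distance, the separation between two re-identification data items equals the separation between the underlying feature vectors, so D′0 = D0. Including the component index in each sub-hash is a further assumption made here so that equal values at different positions do not produce equal sub-hashes. The names `sub_hashes` and `hamming` are hypothetical.

```python
import hashlib
from typing import List, Sequence


def sub_hashes(feature_vector: Sequence[int], token: bytes) -> List[str]:
    """Hash each one-component sub-vector separately, salted with the token."""
    return [
        hashlib.sha256(f"{index}:{value}".encode() + token).hexdigest()
        for index, value in enumerate(feature_vector)
    ]


def hamming(a: Sequence, b: Sequence) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for u, v in zip(a, b) if u != v)


token = b"sigma-1"    # hypothetical tracking permission token
x = [12, 7, 255, 3]
y = [12, 7, 254, 3]   # differs from x in one component
z = [0, 1, 2, 9]      # differs from x in every component

d_xy, d_xz = hamming(x, y), hamming(x, z)
d_gxy = hamming(sub_hashes(x, token), sub_hashes(y, token))
d_gxz = hamming(sub_hashes(x, token), sub_hashes(z, token))
```

Note that this construction does not preserve proximity for an ℓ_p distance over finely quantized values, since two components that are close but unequal hash to unrelated sub-hashes; a coarser quantization or a different proximity-preserving hash would be needed in that case.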

[0076] In step 213.1, the re-identification data items g_{1,1}, g_{2,1}, g_{3,1}, ... are labeled with the corresponding positions X(m1), X(m2), X(m3), ... and exposed to the tracking clients that are permitted to perform visual tracking in the FOVs of the first subgroup 111.1 of the image sources. Exposing the re-identification data items to a tracking client may include sending them to the tracking client in a message or storing them in a shared memory to which the tracking client has access. The shared memory may be configured as a publish/subscribe (Pub/Sub) messaging service. In this example, the first tracking client 140.1 is permitted to perform visual tracking in the FOVs of the first subgroup 111.1 of the image sources. Labeling the re-identification data items does not necessarily imply any modification of the re-identification data items themselves; rather, it can be achieved by storing the re-identification data items and their positions in associated fields of a common data structure (such as the table shown in Figure 4) and/or by creating a pointer, or another computer-readable association, from each position to the re-identification data item or vice versa.

[0077] For the second subgroup 111.2 of the image sources, step 212 includes providing (212.2) second re-identification data items by anonymizing each feature vector using the predefined one-way function h modified by the second tracking permission token σ2, as follows:

[0078] g_{i,2} = h([f(m_i), σ2]), i ∈ I2, (5)

[0079] where the index sets I1 and I2 may be disjoint or may overlap. The re-identification data items g_{1,2}, g_{2,2}, g_{3,2}, ... are labeled with the corresponding positions X(m1), X(m2), X(m3), ... and exposed to the tracking clients that are permitted to perform visual tracking within the FOVs of the second subgroup 111.2 of the image sources. In this example, the second tracking client 140.2, as well as the optional third and fourth tracking clients 140.3 and 140.4, are permitted to perform visual tracking within the FOVs of the second subgroup 111.2 of the image sources.

[0080] In Figure 4, the providing of the first and second re-identification data items corresponds to processes 403.1 and 403.2, respectively. The labeling of the second re-identification data items with positions is illustrated as process 404. A corresponding process (not shown) exists for the first re-identification data items.

[0081] Method 200 further includes step 214 of preventing access to the feature vectors. Measures will be taken to prevent any party from inspecting or accessing the feature vectors, or to make such attempts very difficult. To prevent access to the feature vectors, as explained above, the feature vectors may be transmitted over a data connection protected against eavesdropping and other attacks by unauthorized parties. Furthermore, when storing or caching the feature vectors, adequately protected memory may be used. Note that, at least in some embodiments, step 214 is limited to preventing access to those feature vectors under the control of the anonymization layer 130, such as feature vectors stored in the memory of a component acting as the anonymization layer. In this embodiment, other components of the visual tracking system 100 may be responsible for preventing access outside the control of the anonymization layer 130, i.e., protecting access to the data connection upstream of the anonymization layer 130 over which feature vectors are transmitted.
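
Steps 212 to 214 can be sketched together as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: SHA-256 with concatenation stands in for the one-way function, an in-memory dictionary stands in for the message or shared-memory exposure, and the names `Detection`, `LabeledItem`, and `anonymization_layer` are hypothetical. Access to feature vectors is prevented here only in the sense that the exposed record type has no field for them.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Detection:
    """Internal record; never leaves the anonymization layer."""
    source_id: str
    feature_vector: bytes
    position: Tuple[float, float]


@dataclass(frozen=True)
class LabeledItem:
    """Exposed record: a re-identification data item labeled with a position."""
    reid: str
    source_id: str
    position: Tuple[float, float]


def anonymization_layer(
    detections: List[Detection],
    token_by_source: Dict[str, bytes],
    clients_by_token: Dict[bytes, List[str]],
) -> Dict[str, List[LabeledItem]]:
    """Anonymize (212), label and expose (213); feature vectors stay inside (214)."""
    inbox: Dict[str, List[LabeledItem]] = {}
    for det in detections:
        token = token_by_source[det.source_id]
        reid = hashlib.sha256(det.feature_vector + token).hexdigest()
        item = LabeledItem(reid, det.source_id, det.position)
        for client in clients_by_token[token]:
            inbox.setdefault(client, []).append(item)
    return inbox


SIGMA_1, SIGMA_2 = b"token-1", b"token-2"  # hypothetical token values
inbox = anonymization_layer(
    detections=[
        Detection("120.1", b"ABC123", (100.0, 40.0)),
        Detection("120.2", b"ABC123", (20.0, 45.0)),
        Detection("120.4", b"ABC123", (300.0, 10.0)),
    ],
    token_by_source={"120.1": SIGMA_1, "120.2": SIGMA_1, "120.4": SIGMA_2},
    clients_by_token={SIGMA_1: ["140.1"], SIGMA_2: ["140.2", "140.3"]},
)
```

In this example, client 140.1 receives two matching items for the same target, while the item delivered to clients 140.2 and 140.3 for that target is different and cannot be linked to the first two.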

[0082] Tracking Client

[0083] Turning to Figure 3, the behavior of an individual tracking client 140 will now be described in terms of method 300. It is understood that the same operations can be performed in a general-purpose processor. Method 300 will be described from the perspective of a tracking client authorized to visually track targets in the FOVs of a first group 110.1 of image sources (i.e., the group for which the first tracking permission token σ1 has been used to generate re-identification data items). In the running example, this corresponds to the first tracking client 140.1.

[0084] In an initial step 310, the tracking client receives the first re-identification data items g_{i,1}. This may include receiving the first re-identification data items g_{i,1} in a message or retrieving them from a shared memory. In fact, since the scope of visual tracking is spatially delimited by the non-matching property (1), it is acceptable to make the first re-identification data items g_{i,1} and any second re-identification data items g_{i,2} available in the same shared memory. Each first re-identification data item originates from a feature vector f(m_i) that represents the visual appearance of a tracking target in a sub-region m_i of an image obtained from an image source in the first group, and each first re-identification data item is labeled with the position X(m_i) of the corresponding sub-region. As explained above, the first re-identification data items have more precisely been provided using the predefined one-way function h modified by the first tracking permission token σ1, which is specific to the first group of image sources. To execute method 300, the tracking client does not need to confirm that the first re-identification data items have been provided in this particular manner; the fact that the re-identification data items are anonymized feature vectors may even be opaque to the tracking client.

[0085] In step 311, the tracking client proceeds to search for matching re-identification data items among the received first re-identification data items g_{i,1}.

[0086] If the underlying feature vectors f(m_i) are defined as discrete-valued, or are otherwise defined (e.g., by projection onto a low-dimensional subspace, or by rounding to integer values) such that a match is found only when equation (2) is satisfied, then re-identification data items also match only when all their components are equal. That is, the tracking client searches for a set G_P1 of first re-identification data items such that all pairs g_{j,1}, g_{j′,1} ∈ G_P1 satisfy the equality condition g_{j,1} = g_{j′,1}. Since the one-way function h modified by the first tracking permission token σ1 is collision-free, the equality condition implies that the underlying feature vectors are also equal, f(m_j) = f(m_j′). Each such set G_P1, G_P2, G_P3 can be considered to correspond to one tracking target P1, P2, P3.
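
For the equality case, the search of step 311 amounts to grouping the received items by value. A minimal sketch follows, assuming that each received item is a pair of a re-identification data item (shortened here to a few hypothetical characters) and a position; the function name `group_exact` is likewise an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Position = Tuple[float, float]


def group_exact(items: Iterable[Tuple[str, Position]]) -> Dict[str, List[Position]]:
    """Collect the positions of all items whose re-identification data are equal."""
    groups: Dict[str, List[Position]] = defaultdict(list)
    for reid, position in items:
        groups[reid].append(position)
    return dict(groups)


received = [
    ("a1f3", (10.0, 20.0)),
    ("77b0", (5.0, 5.0)),
    ("a1f3", (12.0, 22.0)),
    ("a1f3", (15.0, 30.0)),
]
groups = group_exact(received)
```

Each key of `groups` plays the role of one set G_P of mutually matching items, and the associated list holds the positions with which those items were labeled.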

[0087] If instead the underlying feature vectors are defined according to criterion (3) above, such that a non-zero separation less than the threshold D0 is accepted as a match, and a proximity-preserving one-way function is used, the tracking client evaluates whether re-identification data items match by applying test (3′). The set G_P1 can be filled iteratively according to the following rule: a new re-identification data item g_{k,1} is added to the set G_P1 if the set already contains at least one re-identification data item g_{j,1} such that d(g_{j,1}, g_{k,1}) < D′0.
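
The iterative filling rule can be sketched as follows, assuming that each re-identification data item is a tuple of sub-hashes and that the distance counts differing sub-hashes. Two choices in the sketch go beyond the text and are assumptions: an item that matches no existing set starts a new set, and an item that matches several sets causes those sets to be merged. The names `distance` and `fill_sets` and the short placeholder strings are hypothetical.

```python
from typing import List, Sequence, Tuple

Item = Tuple[str, ...]  # a re-identification data item made of sub-hashes


def distance(a: Item, b: Item) -> int:
    """Number of positions at which two items differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


def fill_sets(items: Sequence[Item], d0_prime: int) -> List[List[Item]]:
    """Add each new item to a set containing at least one item closer than D'0."""
    sets: List[List[Item]] = []
    for new in items:
        linked = [s for s in sets if any(distance(new, old) < d0_prime for old in s)]
        unlinked = [s for s in sets if not any(distance(new, old) < d0_prime for old in s)]
        merged = [old for s in linked for old in s] + [new]
        sets = unlinked + [merged]
    return sets


a = ("h1", "h2", "h3", "h4")
b = ("h1", "h2", "h3", "hX")  # one sub-hash away from a
c = ("q1", "q2", "q3", "q4")  # unrelated to a and b
d = ("h1", "h2", "hY", "hX")  # one sub-hash away from b, two away from a

sets = fill_sets([a, b, c, d], d0_prime=2)
```

Item `d` is not within the bound of `a` but is within the bound of `b`, so it joins the set that already contains both; this chaining behavior follows directly from the "at least one" wording of the rule.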

[0088] In step 312, for each set of mutually matching re-identification data items, the tracking client tracks the corresponding tracking target P1, P2, P3 based on the positions with which the mutually matching re-identification data items are labeled. The output of step 312 may have a "trajectory" format, i.e., a trajectory for each of the tracking targets P1, P2, P3 from which the position of the tracking target as a function of time can be read, for example a table mapping time points to positions or vice versa. Various graphical output formats for step 312 are also possible.
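
A trajectory in the table format mentioned above might be represented as a time-ordered list of observations. The sketch below assumes that each position label is accompanied by a timestamp, which the text does not state explicitly; the names `build_trajectory` and `position_at` are hypothetical.

```python
import bisect
from typing import List, Optional, Tuple

Position = Tuple[float, float]
Observation = Tuple[float, Position]  # (timestamp, position)


def build_trajectory(observations: List[Observation]) -> List[Observation]:
    """Order the observations of one tracking target by time."""
    return sorted(observations, key=lambda obs: obs[0])


def position_at(trajectory: List[Observation], t: float) -> Optional[Position]:
    """Most recent known position at or before time t, or None if there is none."""
    times = [timestamp for timestamp, _ in trajectory]
    index = bisect.bisect_right(times, t) - 1
    return trajectory[index][1] if index >= 0 else None


trajectory = build_trajectory(
    [(30.0, (12.0, 22.0)), (10.0, (10.0, 20.0)), (50.0, (15.0, 30.0))]
)
mid_position = position_at(trajectory, 35.0)
before_first = position_at(trajectory, 5.0)
```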

[0089] This concludes the description of the basic functionality of the visual tracking system 100 according to its basic architecture. Some alternative implementation methods will now be described.

[0090] Optional implementation methods

[0091] Regarding the architecture of the visual tracking system 100, Figure 1B shows a structure that differs from that of Figure 1A in the following respects.

[0092] The anonymization layer 130 is implemented in a distributed manner. The processing attributable to the anonymization layer 130 is performed in respective processing circuits 121.1, 121.3, and 121.4 co-located with the respective (physical) image sources. This corresponds to processes 403 and 404 in Figure 4 and steps 212, 213, and 214 in Figure 2. The advantage of performing these processes in processing circuitry co-located with the image sources is that it avoids transmitting image data and/or feature vectors over external and potentially insecure data networks, which is in the interest of data security.

[0093] The process of providing feature vectors is also performed in the processing circuits 121.1, 121.3, and 121.4 co-located with the image sources. This corresponds to processes 401 and 402 in Figure 4 and steps 210 and 211 in Figure 2.

[0094] During a common process executed on processing circuitry 121.1, re-identification data items from two (physical) image sources 120.1 and 120.2 are provided using a first tracking permission token σ1. Image sources 120.1 and 120.2 can be considered to constitute subgroup 111.1. The fields of view (FOV) of image sources 120.1 and 120.2 partially overlap.

[0095] Another process executed on the processing circuit 121.3 uses a first tracking permission token σ1 to provide re-identification data items from a further image source 120.3. The first tracking permission token σ1 is therefore specific to the group of image sources consisting of image source 120.1, image source 120.2, image source 120.3 (and possibly more).

[0096] The two image sources 120.4 and 120.5 correspond to different halves of the field of view (FOV) of a single physical image source. The segmentation of the FOV of the physical image sources can be achieved by optical devices or digital image processing. In a common process executed on the processing circuit 121.4, re-identification data items from the two (virtual) image sources 120.4 and 120.5 are provided using a second tracking permission token σ2.

[0097] Regarding re-identification and tracking permissions, the tracking clients 140 and image sources 120 can have a one-to-one, one-to-many, many-to-one, or many-to-many relationship. Here, tracking clients 140.1 and 140.2 are authorized to perform re-identification in the FOVs of image sources 120.1, 120.2, and 120.3 using the first re-identification data items g_{i,1}. The second tracking client 140.2 is also authorized to perform re-identification in the FOVs of the fourth image source 120.4 and the fifth image source 120.5 using the second re-identification data items g_{i,2}.

[0098] The differences between Figure 1A and Figure 1B noted above illustrate structural variations that can be practiced when implementing the teachings of this disclosure. These variations can be practiced individually or in various combinations.

[0099] In some implementations, the scope of visual tracking is defined relative to time, i.e., by adding step 216 to method 200, in which the anonymization layer 130 stops using the first tracking permission token σ1 after its expiration. The anonymization layer 130 can then anonymize the feature vector using a predefined one-way function h modified by an alternative first tracking permission token σ′1, different from σ1. As explained above, the first tracking client will be unable to find a match between re-identified data items generated before and after the token replacement, even if it considers two re-identified data items generated based on the same feature vector.
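
The effect of token replacement on matching can be sketched as follows. This is an illustration under assumptions: tokens are random 16-byte strings, SHA-256 with concatenation stands in for the one-way function, and time is passed in explicitly as a number of seconds. The class name `RotatingToken` and the function `anonymize` are hypothetical.

```python
import hashlib
import secrets


def anonymize(feature_vector: bytes, token: bytes) -> str:
    """Salted one-way function, here SHA-256 over the concatenation."""
    return hashlib.sha256(feature_vector + token).hexdigest()


class RotatingToken:
    """Holds a tracking permission token and replaces it once it has expired."""

    def __init__(self, validity_seconds: float, now: float) -> None:
        if validity_seconds <= 0:
            raise ValueError("validity period must be positive")
        self.validity_seconds = validity_seconds
        self._token = secrets.token_bytes(16)
        self._expires_at = now + validity_seconds

    def current(self, now: float) -> bytes:
        """Return the token valid at time `now`, replacing expired tokens first."""
        while now >= self._expires_at:
            replacement = secrets.token_bytes(16)
            if replacement != self._token:  # the replacement must differ
                self._token = replacement
                self._expires_at += self.validity_seconds
        return self._token


sigma = RotatingToken(validity_seconds=3600.0, now=0.0)
g_before = anonymize(b"ABC123", sigma.current(now=10.0))
g_same_period = anonymize(b"ABC123", sigma.current(now=3599.0))
g_after = anonymize(b"ABC123", sigma.current(now=3600.0))
```

Items generated within one validity period match (`g_before == g_same_period`), while the item generated after the replacement does not match the earlier ones even though the feature vector is unchanged.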

[0100] The validity period of the first tracking permission token σ1 can be predetermined, such as every hour or every day. If not, the expiration time can be dynamically determined. For this purpose, the inventors have envisioned two different but equivalent solutions, which are suitable depending on whether the anonymization layer 130 is implemented using a centralized or distributed architecture.

[0101] In a distributed implementation such as that shown in Figure 1B, method 200 includes a step 215a of negotiating between a first device 130.1 (e.g., processing circuitry 121.1) executing an instance of method 200 using the first tracking permission token σ1 and at least one other device 130.2 (e.g., processing circuitry 121.3) executing a corresponding further instance of method 200 using the same first tracking permission token σ1. The negotiation step 215a may begin with one device proposing an expiration time in a common time base (network time), after which either the other devices unanimously approve the proposal, or at least one other device rejects the proposal while making a counter-proposal for the expiration time. The devices are configured to comply with the approved expiration time.

[0102] Whenever a coordinating entity 150 is available in the visual tracking system 100, that is, regardless of whether the anonymization layer 130 is implemented in a centralized or distributed manner, the expiration of the validity period of the first tracking permission token σ1 can be determined by a decision made by the coordinating entity 150. In a distributed implementation (Figure 1B), the decision regarding the validity period is sent from the coordinating entity 150 to the various clusters of processing circuitry 121 co-located with the image sources 120, which together act as the anonymization layer 130 of the visual tracking system 100. The clusters of processing circuitry that execute respective instances of method 200 and receive the decision regarding the validity period from the coordinating entity 150 (step 215b) are configured to act in accordance with the determined expiration time, i.e., to cease using the tracking permission token. In a centralized implementation (Figure 1A), the coordinating entity 150 receives its own decision internally, i.e., it makes the decision and acts accordingly once the expiration time is reached. For the avoidance of doubt, even if the processes related to the anonymization layer 130 are delegated to other components of the visual tracking system 100, the coordinating entity 150 can still be responsible for decisions regarding the expiration of the validity period.
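
One conceivable form of negotiation step 215a is sketched below. The text does not specify how a device decides whether to approve a proposal; the policy used here is purely an assumption for illustration: each device has a latest expiration time it is willing to accept, approves any proposal not later than that limit, and otherwise counter-proposes its own limit. The function name `negotiate_expiry` is hypothetical, and times are expressed as seconds in the common time base.

```python
from typing import List


def negotiate_expiry(initial_proposal: float, latest_acceptable: List[float]) -> float:
    """Repeat proposal rounds until every device approves the expiration time."""
    proposal = initial_proposal
    while True:
        # Devices that reject the proposal answer with a counter-proposal.
        counter_proposals = [limit for limit in latest_acceptable if proposal > limit]
        if not counter_proposals:
            return proposal  # unanimous approval
        # Assumed policy: continue with the earliest counter-proposal.
        proposal = min(counter_proposals)


agreed = negotiate_expiry(7200.0, latest_acceptable=[3600.0, 5400.0, 9000.0])
accepted_at_once = negotiate_expiry(1800.0, latest_acceptable=[3600.0, 5400.0])
```

Under this assumed policy the loop terminates after at most two rounds, because the earliest limit is acceptable to every device.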

[0103] The expiration of a tracking permission token primarily affects the anonymization layer. However, it can also be communicated to the tracking clients that use the re-identification data items generated by the anonymization layer, allowing these clients to restrict their searches for matching re-identification data items (step 311 of method 300) in time to the validity period of each tracking permission token. This avoids wasting processing resources on futile searches, since a tracking client will never find a match between re-identification data items generated before and after a tracking permission token is replaced.
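
A tracking client that has been informed of the expiration times could partition the received items by validity period before searching for matches. The following is a minimal sketch, assuming that each item carries a timestamp and that the expiration times are supplied as a sorted list; the function name `split_by_validity` and the sample data are hypothetical.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Item = Tuple[float, str, Tuple[float, float]]  # (timestamp, reid, position)


def split_by_validity(
    items: Sequence[Item], expiry_times: Sequence[float]
) -> Dict[int, List[Item]]:
    """Assign each item to the index of the validity period it was generated in."""
    windows: Dict[int, List[Item]] = defaultdict(list)
    for item in items:
        timestamp = item[0]
        # An item stamped exactly at an expiration time belongs to the next period.
        windows[bisect.bisect_right(expiry_times, timestamp)].append(item)
    return dict(windows)


items = [
    (100.0, "aa", (1.0, 1.0)),
    (3500.0, "aa", (2.0, 2.0)),
    (3700.0, "bb", (3.0, 3.0)),
]
windows = split_by_validity(items, expiry_times=[3600.0, 7200.0])
```

The matching search of step 311 would then be run separately within each entry of `windows`, so that items from different validity periods are never compared.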

[0104] Conclusion

[0105] In summary, a visual tracking system has been proposed in which an image source performs object detection on captured images to determine the feature vector and location information (e.g., bounding box) for each detected object or person (hereinafter: target). Each image source irreversibly anonymizes the determined feature vector along with a tracking authorization token to create an anonymized feature vector (re-identified data item). The image source then sends the re-identified data item along with the associated location information as metadata to a tracking client authorized to perform target tracking in the relevant area.

[0106] The proposed solution can be described as a method for tracking targets detected by an image source within a visual tracking system (camera system). In a broader sense, it performs the following actions.

[0107] Each image source is provided with a set of tracking permission tokens. Each tracking permission token is valid for a predetermined period.

[0108] Furthermore, different rights to track targets are implemented at the technical level for different tracking clients.

[0109] The image source creates a metadata structure for each detected target, including an anonymized feature vector and the target's location information. The anonymized feature vector is an anonymized combination of the detected target's feature vector and the corresponding tracking permission token from the set. Ideally, the anonymization of the combination of the feature vector and the corresponding tracking permission token is collision-resistant and irreversible.

[0110] For each image source and each detected target, the metadata structure is transmitted to the tracking client device authorized to track the detected target.

[0111] Each tracking client that receives two or more metadata structures compares the anonymized feature vectors of the different metadata structures. When it finds a match between the anonymized feature vectors, it adds the location information of the two received metadata structures to the trajectory associated with the anonymized feature vector (which in turn can be associated with the detected target).

[0112] The foregoing has primarily described aspects of this disclosure with reference to several embodiments. However, as will be readily apparent to those skilled in the art, other embodiments besides those disclosed above are also possible within the scope of the invention as defined by the appended claims.

Claims

1. A visual tracking system, comprising: a plurality of image sources configured to provide images within their respective fields of view, each image source being configured to detect, in an image obtained from the image source, respective sub-regions containing a tracking target, and, for each sub-region, to calculate a feature vector representing the visual appearance of the tracking target therein; an anonymization layer configured to: provide first re-identification data items by anonymizing each feature vector using a predefined one-way function modified by a first tracking permission token, the first tracking permission token being specific to at least one image source within a first group of the image sources; provide second re-identification data items by anonymizing each feature vector using the predefined one-way function modified by a second tracking permission token, the second tracking permission token being specific to at least one image source in a second group of the image sources and being different from the first tracking permission token, wherein the first group and the second group of the image sources at least partially overlap; and expose, to tracking clients, the first and second re-identification data items labeled with the locations of the corresponding sub-regions, while preventing access to the feature vectors; and a plurality of tracking clients configured to perform re-identification of tracking targets using data obtained from the anonymization layer, including: a first tracking client authorized to perform re-identification within the fields of view of the first group of the image sources and to acquire the location-labeled first re-identification data items; and a second tracking client authorized to perform re-identification within the fields of view of the second group of the image sources and to acquire the location-labeled second re-identification data items; wherein the anonymization layer is implemented in processing circuitry that is separate from the tracking clients.

2. The visual tracking system according to claim 1, wherein the anonymization layer is implemented in processing circuitry of a coordinating entity in the visual tracking system and/or in processing circuitry co-located with respective ones of the image sources, the coordinating entity being implemented in a dedicated server or host in a computer network, or implemented as a process executed on a computer or virtual machine that has additional responsibilities in the visual tracking system.

3. A method for providing anonymized data to facilitate re-identification in a visual tracking system, the method comprising: detecting, in images obtained from a plurality of image sources, respective sub-regions each containing a tracking target; calculating, for each sub-region, a feature vector representing the visual appearance of the tracking target therein; for a first subgroup of the image sources: providing first re-identification data items by anonymizing each feature vector using a predefined one-way function modified by a first tracking permission token; and exposing, to a first tracking client, the first re-identification data items labeled with the locations of the corresponding sub-regions; for a second subgroup of the image sources: providing second re-identification data items by anonymizing each feature vector using the predefined one-way function modified by a second tracking permission token that is different from the first tracking permission token; and exposing, to a second tracking client, the second re-identification data items labeled with the locations of the corresponding sub-regions; and preventing access to the feature vectors; wherein the first tracking permission token is specific to a first group of the image sources that includes the first subgroup, and/or the second tracking permission token is specific to a second group of the image sources that includes the second subgroup, and wherein the first group of the image sources and the second group of the image sources at least partially overlap.

4. The method of claim 3, further comprising: In the first tracking permission token The first tracking permission token will cease to be used after its validity period expires. .

5. The method of claim 4, further comprising: The apparatus for executing the method and the use of the first tracking permission token Performing actions regarding the first tracking permission token between at least one other device. The negotiation of the expiration date, wherein the negotiation concludes with the unanimous approval of the proposed expiration date.

6. The method of claim 4, further comprising: receiving an indication of the expiry of the validity period of the first tracking permission token, as determined by a coordinating entity in the visual tracking system, wherein the coordinating entity is implemented in a dedicated server or host in a computer network, or as a process executing on a computer or virtual machine that has additional responsibilities in the visual tracking system.

7. The method according to any one of claims 4 to 6, wherein, after the validity period of the first tracking permission token expires, the feature vectors of the first subgroup of the image sources are anonymized using the predefined one-way function modified by a replacement first tracking permission token.
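
The expiry and replacement behavior of claims 4 and 7 could look roughly like the sketch below. It assumes a locally managed token with a fixed validity period and a caller-supplied clock value. The names `TrackingPermissionToken` and `TokenManager` are hypothetical, and generating the replacement with `secrets.token_bytes` is an assumption, since the claims do not say how tokens are created.

```python
import secrets
from dataclasses import dataclass


@dataclass
class TrackingPermissionToken:
    value: bytes
    expires_at: float  # seconds, on the same clock as the `now` arguments


class TokenManager:
    def __init__(self, validity_seconds: float, now: float):
        self.validity_seconds = validity_seconds
        self.token = self._issue(now)

    def _issue(self, now: float) -> TrackingPermissionToken:
        # A fresh random value, valid for one validity period from `now`.
        return TrackingPermissionToken(secrets.token_bytes(32),
                                       now + self.validity_seconds)

    def current(self, now: float) -> bytes:
        # Once the validity period has expired, stop using the old token
        # and switch to a replacement.
        if now >= self.token.expires_at:
            self.token = self._issue(now)
        return self.token.value


manager = TokenManager(validity_seconds=60.0, now=0.0)
before = manager.current(now=59.0)  # still within the validity period
after = manager.current(now=60.0)   # expired, so a replacement is issued
```

Items anonymized with the replacement token cannot be matched against items produced under the expired one, which bounds in time what a tracking client can link.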

8. The method according to claim 3, wherein: the feature vector is computed to take values from a discrete set; and/or the one-way function used to provide the re-identification data items is proximity-preserving.
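
To illustrate what a token-dependent, proximity-preserving mapping might look like, the sketch below uses sign-of-random-projection codes (a form of locality-sensitive hashing) whose hyperplanes are seeded from the token. This technique is swapped in for illustration only: the claim does not name a function, the names `projection_code` and `hamming` are hypothetical, and the sketch makes no claim about the cryptographic strength of the mapping.

```python
import hashlib
import random


def projection_code(feature_vector, token: bytes, n_bits: int = 64):
    # Derive a deterministic seed from the token, so that each token
    # defines its own set of random hyperplanes.
    seed = int.from_bytes(hashlib.sha256(token).digest(), "big")
    rng = random.Random(seed)
    bits = []
    for _ in range(n_bits):
        plane = [rng.gauss(0.0, 1.0) for _ in feature_vector]
        dot = sum(p * v for p, v in zip(plane, feature_vector))
        # Keep only the side of the hyperplane; the magnitude is discarded.
        bits.append(1 if dot >= 0 else 0)
    return bits


def hamming(a, b) -> int:
    return sum(x != y for x, y in zip(a, b))


token = b"hypothetical-token"
v = [0.9, 0.1, -0.3, 0.5]
code_v = projection_code(v, token)
# A positively scaled copy points in the same direction: identical code.
code_scaled = projection_code([2 * x for x in v], token)
# The negated vector lies on the opposite side of every hyperplane.
code_negated = projection_code([-x for x in v], token)
```

Vectors at a small angle to each other tend to differ in few bits, so a tracking client could match items by Hamming distance rather than exact equality.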

9. The method according to claim 3, wherein providing the re-identification data items includes applying the tracking permission token that modifies the predefined one-way function as a cryptographic salt.
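
Applying the token as a cryptographic salt, as in claim 9, might be sketched with a standard salted key-derivation function. The choice of PBKDF2 with SHA-256, the iteration count, the function name `salted_digest`, and the integer-quantized feature vector are all assumptions for illustration.

```python
import hashlib


def salted_digest(feature_vector, token: bytes, iterations: int = 10_000) -> str:
    # Serialize the (assumed integer-quantized) feature vector deterministically.
    payload = ",".join(str(int(v)) for v in feature_vector).encode("utf-8")
    # The tracking permission token is passed in the salt position.
    return hashlib.pbkdf2_hmac("sha256", payload, token, iterations).hex()


fv = [3, 14, 15, 92, 65]
digest_a = salted_digest(fv, b"hypothetical-token-a")
digest_b = salted_digest(fv, b"hypothetical-token-b")
```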

10. The method according to claim 3, wherein the tracking targets include people and/or inanimate objects.

11. A method for visually tracking a target in the fields of view of a first group of image sources, the method comprising: at an anonymization layer of the visual tracking system, providing anonymized data by performing the method of claim 3; and at the first tracking client in the visual tracking system: receiving the first re-identification data items, each first re-identification data item being marked with the location of the corresponding sub-region; searching for matching re-identification data items among the first re-identification data items; and, for a set of mutually matching re-identification data items, tracking a tracking target based on the locations with which the mutually matching re-identification data items are marked.
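
The client-side steps of claim 11 can be sketched as a grouping operation. The claim does not specify the matching criterion, so exact equality of the anonymized values is assumed here. With a proximity-preserving one-way function, a distance threshold would take its place. The name `build_tracks`, the item layout, and the sample values are hypothetical.

```python
from collections import defaultdict


def build_tracks(items):
    """Group received items; `items` holds (reid_value, location) pairs in arrival order."""
    groups = defaultdict(list)
    for reid_value, location in items:
        groups[reid_value].append(location)
    # A track needs at least two mutually matching items.
    return {rid: locs for rid, locs in groups.items() if len(locs) >= 2}


# Hypothetical items as a first tracking client might receive them.
received = [
    ("a1f3", ("cam-A", 10, 20)),
    ("9c7e", ("cam-A", 200, 40)),
    ("a1f3", ("cam-B", 55, 18)),
    ("a1f3", ("cam-C", 90, 22)),
]
tracks = build_tracks(received)
```

The client works only with anonymized values and locations, and never needs the underlying feature vectors to follow a target from one image source to the next.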

12. The method according to claim 11, wherein the search for matching re-identification data items is limited in time to the validity period of the first tracking permission token.

13. The method according to claim 11 or 12, further comprising, at the second tracking client in the visual tracking system: receiving the second re-identification data items, each second re-identification data item being marked with the location of the corresponding sub-region; searching for matching re-identification data items among the second re-identification data items; and, for a set of mutually matching re-identification data items, tracking a tracking target based on the locations with which the mutually matching re-identification data items are marked.

14. An apparatus or cluster of apparatuses including a processing circuit configured to perform the method of claim 3.

15. A computer program comprising instructions that, when executed by a computer, cause the computer to perform the method according to claim 3.