An image is encoded to define one or more spatial regions that can be sensed by a suitably equipped 
mobile device (e.g., a smartphone), but are imperceptible to humans. When such a 
mobile device senses one of these regions, it takes an action in response (e.g., rendering an associated tone or playing linked video). The regions may overlap in layered fashion. One form of encoding employs modification of the color content of the image at higher spatial frequencies, where human vision is not acute. In a particular embodiment, the encoding comprises altering a transform domain representation of the image by adding 
signal energy in a first 
chrominance channel, where the added 
signal energy falls primarily within a segmented arc region in a transform 
domain space. In another arrangement, a smartphone display presents both image data captured from a scene, and a transform representation of the image data (e.g., in the 
Fourier domain). This latter information can aid a user in positioning the phone, e.g., to enhance decoding of a steganographic digital 
watermark. In still another arrangement, foveal filtering is applied to smartphone-captured image data in connection with other 
image processing.
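
The chrominance-channel embodiment described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes a 2-D Fourier transform as the transform domain, treats a single floating-point array as the first chrominance channel, and models the segmented arc region as a band of radial frequencies (in cycles per pixel) between two angles. The function names, parameter names, and numeric values are hypothetical and chosen only for illustration. Because the pixel data is real-valued, the added energy necessarily appears both in the arc and in its point-reflected mirror.

```python
import numpy as np


def arc_mask(shape, r_min, r_max, theta_min, theta_max):
    """Boolean mask of FFT bins whose frequency lies in one arc segment.

    Radii are in cycles/pixel, angles in radians. The mask uses the
    unshifted bin layout returned by np.fft.fft2.
    """
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequency of each row
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequency of each column
    radius = np.hypot(fx, fy)
    angle = np.arctan2(fy, fx)
    return ((radius >= r_min) & (radius <= r_max)
            & (angle >= theta_min) & (angle <= theta_max))


def add_arc_energy(chroma, mask, strength=2.0, seed=0):
    """Add a real-valued pattern whose spectrum is confined to the arc.

    Assumption: random phases are used for the added components, and
    `strength` is the RMS amplitude of the added pattern in channel units.
    """
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=chroma.shape)
    # Taking the real part makes the pattern real-valued, which places
    # energy in the arc and in its mirror image through the origin.
    pattern = np.fft.ifft2(mask * np.exp(1j * phases)).real
    pattern /= np.sqrt(np.mean(pattern ** 2))   # normalize to unit RMS
    return chroma + strength * pattern


def arc_energy_fraction(delta, mask):
    """Fraction of spectral energy of `delta` inside the arc or its mirror."""
    energy = np.abs(np.fft.fft2(delta)) ** 2
    # mirror[i, j] == mask[-i mod h, -j mod w]
    mirror = np.roll(mask[::-1, ::-1], 1, axis=(0, 1))
    return energy[mask | mirror].sum() / energy.sum()


if __name__ == "__main__":
    chroma = np.full((64, 64), 128.0)   # flat stand-in chrominance plane
    mask = arc_mask(chroma.shape, 0.25, 0.40, np.pi / 6, np.pi / 3)
    marked = add_arc_energy(chroma, mask, strength=2.0, seed=1)
    print(arc_energy_fraction(marked - chroma, mask))
```

A detector following the same assumptions could compare the energy found in a given arc against the energy in neighboring arcs to decide whether a region is present; the thresholds for such a decision are not specified here.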