Beyond neural activity prediction: Probing latent representations in mouse V1 digital twins
Abstract
Digital twins of sensory cortex serve as powerful response oracles, predicting neural activity in response to novel stimuli. Although prediction accuracy is the central metric by which these models are evaluated, it provides limited insight into the latent representations that support those predictions. This limitation becomes increasingly important as digital twins are used as in silico experimental systems for stimulus design and hypothesis generation: models with similar prediction accuracy may nevertheless rely on different latent representations. We address this gap by systematically probing a family of convolutional–recurrent digital twins of mouse V1 trained to predict neural activity from naturalistic videos recorded in freely moving mice. The models share the same training data and neural-prediction objective but differ in visual-encoder architecture. For each frozen model, we characterize latent representations at three levels: (i) linear decodability of orientation, contrast, and motion from responses to controlled visual probes; (ii) latent-unit tuning properties, including orientation selectivity, contrast response, spatial-frequency tuning, and phase sensitivity; and (iii) population geometry of hidden-layer activity. Across architectures, better neural-response prediction is associated with higher probe accuracy and with hidden-population geometry shifted toward higher dimensionality and flatter eigenspectra, closer to those reported for mouse V1. Although these representational properties covary with prediction accuracy across architectures, most strongly for population geometry, twins with comparable prediction scores can still differ substantially in probe performance and latent-unit tuning. These results establish multi-level representational probing as a complement to standard neural-prediction evaluation, providing a framework for understanding digital twins not only as predictors but also as substrates for studying visual computations.
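To make the population-geometry level concrete, the following is a minimal sketch of two commonly used summary statistics for hidden-layer activity: the participation ratio as a measure of effective dimensionality, and the log-log slope of the covariance eigenspectrum, where a smaller exponent corresponds to a flatter spectrum. The abstract does not specify which estimators are used, so these choices, the function names, and the `(n_samples, n_units)` activity layout are all assumptions made for illustration.

```python
import numpy as np


def eigenspectrum(activity):
    """Covariance eigenvalues (descending) of an (n_samples, n_units) matrix.

    Hypothetical helper: `activity` stands in for hidden-layer responses
    of a frozen model to a set of stimuli.
    """
    centered = activity - activity.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data, divided by (n - 1),
    # equal the eigenvalues of the sample covariance matrix.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return singular_values**2 / (activity.shape[0] - 1)


def participation_ratio(eigvals):
    """Effective dimensionality: (sum of eigenvalues)^2 / sum of squares."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals.sum() ** 2 / (eigvals**2).sum()


def powerlaw_exponent(eigvals, lo=1, hi=None):
    """Least-squares fit of eigenvalue ~ rank^(-alpha) over ranks lo..hi.

    Assumes the selected eigenvalues are strictly positive. The fit range
    is a free choice that should be reported alongside the exponent.
    """
    eigvals = np.asarray(eigvals, dtype=float)
    hi = len(eigvals) if hi is None else hi
    ranks = np.arange(lo, hi + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(eigvals[lo - 1:hi]), 1)
    return -slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for hidden activity: 500 stimuli x 64 units.
    hidden = rng.normal(size=(500, 64))
    spectrum = eigenspectrum(hidden)
    print("participation ratio:", participation_ratio(spectrum))
    print("power-law exponent:", powerlaw_exponent(spectrum, lo=1, hi=32))
```

Under these assumptions, a higher participation ratio together with a smaller fitted exponent would correspond to the shift toward higher dimensionality and flatter eigenspectra described above. Both quantities depend on the stimulus set, the number of samples, and the fit range, so they are best compared across models evaluated on identical inputs.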