Kendrick Kay
Researcher at University of Minnesota
NeuroImage, 2015-04-18
Intrinsic cortical dynamics are thought to underlie trial-to-trial variability of visually evoked responses in animal models. Understanding their function in the context of sensory processing and representation is a major current challenge. Here we report that intrinsic cortical dynamics strongly affect the representational geometry of a brain region, as reflected in response-pattern dissimilarities, and exaggerate the similarity of representations between brain regions. We characterized the representations in several human visual areas by representational dissimilarity matrices (RDMs) constructed from fMRI response-patterns for natural image stimuli. The RDMs of different visual areas were highly similar when the response-patterns were estimated on the basis of the same trials (sharing intrinsic cortical dynamics), and quite distinct when patterns were estimated on the basis of separate trials (sharing only the stimulus-driven component). We show that the greater similarity of the representational geometries can be explained by the coherent fluctuations of regional-mean activation within visual cortex, reflecting intrinsic dynamics. Using separate trials to study stimulus-driven representations revealed clearer distinctions between the representational geometries: a Gabor wavelet pyramid model explained representational geometry in visual areas V1–3 and a categorical animate–inanimate model in the object-responsive lateral occipital cortex.
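The same-trial versus separate-trial RDM comparison described above can be sketched in a few lines. This is a toy simulation, not the study's analysis code: the shared trial-wise fluctuation `s`, the per-area voxel loadings, and all array sizes are invented for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

def rdm(patterns):
    """Correlation-distance RDM for (n_stimuli, n_voxels) response patterns."""
    return squareform(pdist(patterns, metric="correlation"))

def compare_rdms(a, b):
    """Spearman correlation of the upper triangles of two RDMs."""
    iu = np.triu_indices_from(a, k=1)
    return spearmanr(a[iu], b[iu])[0]

rng = np.random.default_rng(0)
n_stim, n_vox = 20, 100

# intrinsic fluctuation shared by both areas when patterns come from the same trials
s = rng.standard_normal(n_stim)
area1 = rng.standard_normal((n_stim, n_vox)) + 2 * np.outer(s, rng.standard_normal(n_vox))
area2 = rng.standard_normal((n_stim, n_vox)) + 2 * np.outer(s, rng.standard_normal(n_vox))
sim_same = compare_rdms(rdm(area1), rdm(area2))

# separate trials: the second area's estimate carries an independent fluctuation
t = rng.standard_normal(n_stim)
area2_sep = rng.standard_normal((n_stim, n_vox)) + 2 * np.outer(t, rng.standard_normal(n_vox))
sim_separate = compare_rdms(rdm(area1), rdm(area2_sep))
```

In this toy setup `sim_same` should come out larger than `sim_separate`, mirroring the finding that shared intrinsic fluctuations exaggerate the apparent similarity of representational geometries between areas.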
NeuroImage, 2017-08-06
The goal of cognitive neuroscience is to understand how mental operations are performed by the brain. Given the complexity of the brain, this is a challenging endeavor that requires the development of formal models. Here, we provide a perspective on models of neural information processing in cognitive neuroscience. We define what these models are, explain why they are useful, and specify criteria for evaluating models. We also highlight the difference between functional and mechanistic models, and call attention to the value that neuroanatomy has for understanding brain function. Based on the principles we propose, we proceed to evaluate the merit of recently touted deep neural network models. We contend that these models are promising, but substantial work is necessary to (i) clarify what type of explanation these models provide, (ii) determine what specific effects they accurately explain, and (iii) improve our understanding of how they work.
Journal of Mathematical Psychology, 2016-11-19
Studies of the primate visual system have begun to test a wide range of complex computational object-vision models. Realistic models have many parameters, which in practice cannot be fitted using the limited amounts of brain-activity data typically available. Task performance optimization (e.g. using backpropagation to train neural networks) provides major constraints for fitting parameters and discovering nonlinear representational features appropriate for the task (e.g. object classification). Model representations can be compared to brain representations in terms of the representational dissimilarities they predict for an image set. This method, called representational similarity analysis (RSA), enables us to test the representational feature space as is (fixed RSA) or to fit a linear transformation that mixes the nonlinear model features so as to best explain a cortical area's representational space (mixed RSA). Like voxel/population-receptive-field modelling, mixed RSA uses a training set (different stimuli) to fit one weight per model feature and response channel (voxels here), so as to best predict the response profile across images for each response channel. We analysed response patterns elicited by natural images, which were measured with functional magnetic resonance imaging (fMRI). We found that early visual areas were best accounted for by shallow models, such as a Gabor wavelet pyramid (GWP). The GWP model performed similarly with and without mixing, suggesting that the original features already approximated the representational space, obviating the need for mixing. However, a higher ventral-stream visual representation (lateral occipital region) was best explained by the higher layers of a deep convolutional network, and mixing of its feature set was essential for this model to explain the representation. 
We suspect that mixing was essential because the convolutional network had been trained to discriminate a set of 1000 categories, whose frequencies in the training set did not match their frequencies in natural experience or their behavioural importance. The latter factors might determine the representational prominence of semantic dimensions in higher-level ventral-stream areas. Our results demonstrate the benefits of testing both the specific representational hypothesis expressed by a model's original feature space and the hypothesis space generated by linear transformations of that feature space.
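The fixed-versus-mixed RSA comparison can be illustrated on synthetic data. Everything below is an assumption for illustration (the ridge penalty, all sizes, and a ground truth in which voxels read out only the first five model features): fit one weight per model feature and voxel on training stimuli, then ask whether the RDM of the re-weighted (mixed) features matches the "brain" RDM better than the RDM of the original (fixed) features.

```python
import numpy as np

def rdm(patterns):
    """Correlation-distance RDM for (n_stimuli, n_channels) patterns."""
    return 1.0 - np.corrcoef(patterns)

def rdm_corr(a, b):
    """Pearson correlation of the upper triangles of two RDMs."""
    iu = np.triu_indices_from(a, k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

def fit_mixing(F, Y, lam=1.0):
    """Ridge fit of one weight per model feature and response channel (mixed RSA)."""
    return np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ Y)

rng = np.random.default_rng(1)
n_train, n_test, n_feat, n_vox = 60, 20, 30, 80

# ground truth: voxels read out only the first 5 of 30 model features
mix_true = np.zeros((n_feat, n_vox))
mix_true[:5] = rng.standard_normal((5, n_vox))

F_train = rng.standard_normal((n_train, n_feat))
F_test = rng.standard_normal((n_test, n_feat))
Y_train = F_train @ mix_true + 0.1 * rng.standard_normal((n_train, n_vox))
Y_test = F_test @ mix_true + 0.1 * rng.standard_normal((n_test, n_vox))

W = fit_mixing(F_train, Y_train)            # (n_features, n_voxels), fit on training stimuli
brain = rdm(Y_test)
fixed_score = rdm_corr(rdm(F_test), brain)      # fixed RSA: feature space as-is
mixed_score = rdm_corr(rdm(F_test @ W), brain)  # mixed RSA: fitted linear transform
```

Because the simulated voxels use only a subset of the features, the mixed score should exceed the fixed score, which is the situation the abstract describes for the deep network's higher layers.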
How variable is the functionally-defined structure of early visual areas in human cortex and how much variability is shared between twins? Here we quantify individual differences in the best understood functionally-defined regions of cortex: V1, V2, V3. The Human Connectome Project 7T Retinotopy Dataset includes retinotopic measurements from 181 subjects, including many twins. We trained four "anatomists" to manually define V1-V3 using retinotopic features. These definitions were more accurate than automated anatomical templates and showed that surface areas for these maps varied more than three-fold across individuals. This three-fold variation was little changed when normalizing visual area size by the surface area of the entire cerebral cortex. In addition to varying in size, we find that visual areas vary in how they sample the visual field. Specifically, the cortical magnification function differed substantially among individuals, with the relative amount of cortex devoted to central vision varying by more than a factor of 2. To complement the variability analysis, we examined the similarity of visual area size and structure across twins. Whereas the twin sample sizes are too small to make precise heritability estimates (50 monozygotic pairs, 34 dizygotic pairs), they nonetheless reveal high correlations, consistent with strong effects of the combination of shared genes and environment on visual area size. Collectively, these results provide the most comprehensive account of individual variability in visual area structure to date, and provide a robust population benchmark against which new individuals and developmental and clinical populations can be compared.
eneuro, 2018-06-28
Foreground-background segmentation is one of the major challenges in visual neuroscience. Data from nonhuman primates show that segmentation gives rise to two distinct but associated processes: the enhancement of neural activity during figure processing (i.e., foreground enhancement) and the suppression of background-related activity (i.e., background suppression). To study foreground-background segmentation under ecological conditions, we introduce a novel method based on parametric modulation of low-level image properties followed by application of simple computational image-processing models. By correlating the outcome of this procedure with human fMRI activity measured during passive viewing of 334 natural images, we reconstruct easily interpretable 'neural images' from seven visual areas: V1, V2, V3, V3A, V3B, V4 and LOC. Results show evidence of foreground enhancement in all tested regions, whereas background suppression occurs specifically in V4 and LOC. 'Neural images' reconstructed from V4 and LOC revealed a preserved spatial resolution of foreground textures, indicating a richer representation of the salient parts of natural images rather than a simplistic model of object shape. Our results indicate that scene segmentation is an automatic process that occurs during natural viewing, even when individuals are not required to perform any particular task.
Most functional magnetic resonance imaging (fMRI) is conducted with gradient-echo pulse sequences. Although this yields high sensitivity to blood oxygenation level dependent (BOLD) signals, gradient-echo acquisitions are heavily influenced by venous effects which limit the ultimate spatial resolution and spatial accuracy of fMRI. While alternative acquisition methods such as spin-echo can be used to mitigate venous effects, these methods lead to serious reductions in signal-to-noise ratio and spatial coverage, and are difficult to implement without leakage of undesirable non-spin-echo effects into the data. Moreover, analysis heuristics such as masking veins or sampling inner cortical depths using high-resolution fMRI may be helpful, but sacrifice information from many parts of the brain. Here, we describe a new analysis method that is compatible with conventional gradient-echo acquisition and provides venous-free response estimates throughout the entire imaged volume. The method involves fitting a low-dimensional manifold characterizing variation in response timecourses observed in a given dataset, and then using identified early and late timecourses as basis functions for decomposing responses into components related to the microvasculature (capillaries and small venules) and the macrovasculature (veins), respectively. We show that this Temporal Decomposition through Manifold Fitting (TDM) method is robust, consistently deriving meaningful timecourses in individual fMRI scan sessions. Moreover, we show that by removing late components, TDM substantially reduces the superficial cortical depth bias present in gradient-echo BOLD responses and eliminates artifacts in cortical activity maps. TDM is general: it can be applied to any task-based fMRI experiment, can be used with standard- or high-resolution fMRI acquisitions, and can even be used to remove residual venous effects from specialized acquisition methods like spin-echo. We suggest that TDM is a powerful method that improves the spatial accuracy of fMRI and provides insight into the origins of the BOLD signal.
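The decomposition step can be illustrated with a toy example. The gamma-shaped timecourses, the 3-second shift between the early and late bumps, and the weight maps below are all invented for illustration; they are not the timecourses TDM derives from real data, where the basis comes from manifold fitting.

```python
import numpy as np

def gamma_course(t, delay):
    """Toy hemodynamic timecourse: a shifted gamma-like bump, peak-normalized."""
    u = np.maximum(t - delay, 0.0)
    h = u ** 4 * np.exp(-u)
    return h / h.max()

def tdm_decompose(data, early, late):
    """Least-squares projection of each voxel's timecourse onto the early
    (microvascular) and late (venous) basis timecourses."""
    B = np.column_stack([early, late])            # (n_timepoints, 2)
    W, *_ = np.linalg.lstsq(B, data, rcond=None)  # (2, n_voxels)
    return W[0], W[1]

t = np.arange(0.0, 30.0, 1.0)
early, late = gamma_course(t, 0.0), gamma_course(t, 3.0)

rng = np.random.default_rng(2)
micro, macro = rng.random(50), rng.random(50)     # hypothetical weight maps, 50 voxels
data = (np.outer(early, micro) + np.outer(late, macro)
        + 0.01 * rng.standard_normal((t.size, 50)))

w_early, w_late = tdm_decompose(data, early, late)
venous_free = data - np.outer(late, w_late)       # discard the venous (late) component
```

Subtracting the late component, as in the last line, is the analogue of the step that reduces the superficial-depth bias in the real analysis.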