To increase the power of neuroimaging analyses, it is common practice to reduce the whole-brain search space to a subset of hypothesis-driven regions of interest (ROIs). Rather than strictly constrain analyses, we propose to incorporate prior knowledge through probabilistic ROIs (pROIs) within a hierarchical Bayesian framework. Each voxel's prior probability of being “of-interest” or “of-non-interest” is used to perform a weighted fit of a mixture model. We demonstrate the utility of this approach through simulations with various pROIs, and its applicability by using a prior based on the NeuroSynth database search term “emotion” to threshold the fMRI results of an emotion-processing task. The modular structure of pROI correction facilitates the inclusion of other innovations in Bayesian mixture modeling, and offers a foundation for balancing exploratory analyses with prior knowledge.
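The weighted mixture fit can be sketched as a small EM loop in which each voxel's pROI value acts as its mixing weight. This is an illustrative reconstruction, not the authors' code: the function name, the fixed N(0,1) null component, and the choice to keep the prior fixed across iterations are our simplifying assumptions.

```python
import numpy as np

def proi_mixture_em(z, prior, n_iter=50):
    """Fit a two-component Gaussian mixture to voxel statistics z.
    Each voxel's mixing weight is its pROI prior probability of being
    'of-interest'; the null component is fixed at N(0, 1)."""
    mu, sigma = 1.0, 1.0          # initial activation-component parameters
    pi = np.asarray(prior, dtype=float)
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is of-interest
        like_act = np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        like_null = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
        resp = pi * like_act / (pi * like_act + (1 - pi) * like_null + 1e-12)
        # M-step: responsibility-weighted update of the activation component
        w = resp.sum()
        mu = (resp * z).sum() / (w + 1e-12)
        sigma = np.sqrt((resp * (z - mu) ** 2).sum() / (w + 1e-12)) + 1e-6
    return resp, mu, sigma
```

The returned responsibilities can then be thresholded to declare voxels active, with the pROI pulling borderline voxels inside the prior region toward significance.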
Incorporating prior knowledge raises the question of its source. Meta-analysis tools such as NeuroSynth use coordinates reported in papers and therefore miss many subthreshold effects. To improve this situation we propose NeuroVault: a lightweight, easy-to-use web database. The core principle behind its development was to minimize the time and effort needed for a user to submit their statistical parametric maps (SPMs). At the same time, we give users the ability to provide additional details that can potentially improve the quality of meta-analyses. We also provide a simple API that lets developers pull the existing data by making simple queries over the DOIs of papers and/or metadata fields.
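A client-side sketch of what such a DOI-based query might look like; the endpoint path and `DOI` parameter name here are illustrative assumptions, not a documented contract:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

NEUROVAULT = "https://neurovault.org/api"  # assumed base URL for illustration

def collections_query(doi):
    """Build a query URL that filters collections by the DOI of the
    associated paper (endpoint and parameter names are assumptions)."""
    return f"{NEUROVAULT}/collections/?DOI={quote(doi, safe='')}"

def fetch_collections(doi):
    """Fetch and decode the JSON metadata for a paper's collections."""
    with urlopen(collections_query(doi)) as resp:
        return json.load(resp)
```

A meta-analysis tool could then iterate over the returned collections and download the full statistical maps rather than only peak coordinates.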
Data sharing in the broader sense remains problematic in our field. Although it has been a cornerstone of large-scale data consortia, the incentive for the individual researcher remains unclear. Other fields have benefited from embracing a dedicated publication form – the data paper – that allows researchers to publish their datasets as citable scientific publications. Such publishing mechanisms give credit that is recognizable within the scientific ecosystem, and ensure the quality of the published data and metadata through the peer review process.
Representation of objects by higher-order generalizations of vectors and matrices, also known as multi-way arrays, has become very common in many research areas, e.g. image analysis, chemometrics, neuroinformatics, psychometrics and web mining. Surprisingly, multi-way classification tools, i.e. tools that exploit the multi-dimensional structure to discriminate between classes, have received little study. Furthermore, the few existing tools do not take into account contextual information, which can be very beneficial in the classification process.
Such is the case for spectra and other processes/objects we are interested in, which have a shape as a function of e.g. time/position/frequency. They are often continuous functions (they do not jump), and it is in fact their shape that is significant for classification. However, they are traditionally represented by sampling, i.e. as a sequence of individual observations, so that an object is represented in a high-dimensional space. This representation as a feature vector or higher-order array is not optimal, as it generally considers each feature independently. The continuous nature of the data, i.e. the relationship between neighbouring features, is ignored; it is therefore difficult to find discriminative spectral characteristics.
The Dissimilarity Representation (DR) approach and its potential for data with characteristics similar to those of spectral data sets have been demonstrated in previous studies. The advantages of this approach, including the fact that the analysis of spectral data can be enriched by taking domain knowledge into account, have led us to investigate the DR as a new tool for classifying multi-way spectral (or, more generally, continuous) data.
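A minimal sketch of the dissimilarity-representation idea for sampled spectra, using a derivative-based measure as one possible shape-aware dissimilarity (the DR literature admits many others; this particular choice is ours for illustration):

```python
import numpy as np

def shape_dissimilarity(a, b):
    """Shape-aware dissimilarity between two sampled spectra: Euclidean
    distance between first derivatives, so a constant baseline offset is
    ignored and local shape drives the comparison."""
    return np.linalg.norm(np.diff(a) - np.diff(b))

def dissimilarity_representation(spectra, prototypes):
    """Map each spectrum to its vector of dissimilarities to a fixed set
    of prototype spectra; any vector-space classifier then applies."""
    return np.array([[shape_dissimilarity(s, p) for p in prototypes]
                     for s in spectra])
```

The key point is that domain knowledge (here: shape matters, offsets do not) enters through the dissimilarity measure rather than through the raw high-dimensional sampling.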
Hyperalignment (Haxby et al., Neuron 2011, http://www.ncbi.nlm.nih.gov/pubmed/22017997) is a method based on statistical shape analysis that derives a common model space from single participants' voxel spaces. Here I will first present the basic concepts of statistical shape analysis and illustrate the algorithm for evaluating the Procrustes distance. I will then explain how this is applied to fMRI data and transformed in Haxby's idea. Finally, I will discuss the potential application of this method to longitudinal fMRI problems.
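The Procrustes distance at the heart of the talk can be computed in a few lines. This is the standard textbook formulation (centering, unit-norm scaling, optimal orthogonal transform via SVD), not code from the hyperalignment paper:

```python
import numpy as np

def procrustes_distance(X, Y):
    """Procrustes distance between two point configurations of equal
    shape: center, scale to unit Frobenius norm, find the optimal
    orthogonal transform via SVD, and return the residual misfit."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Xc = Xc / np.linalg.norm(Xc)
    Yc = Yc / np.linalg.norm(Yc)
    # min over orthogonal R of ||Yc - Xc R||_F^2 = 2 - 2 * sum of
    # singular values of Xc^T Yc
    s = np.linalg.svd(Xc.T @ Yc, compute_uv=False)
    return np.sqrt(max(0.0, 2.0 - 2.0 * s.sum()))
```

In the hyperalignment setting, the optimal transforms found this way map each participant's voxel-space response patterns into a shared model space.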
Diffusion MRI (dMRI) data allow the reconstruction of the 3D pathways of axons within the white matter of the brain as a set of streamlines, called tractography. A streamline is a vectorial representation of thousands of neuronal axons expressing structural connectivity. An important task is to group streamlines belonging to a common anatomical area into the same cluster. This task is known as tract segmentation, and it is extremely helpful for neurosurgery and for diagnosing brain diseases. However, the segmentation process is difficult and time consuming due to the large number of streamlines (about 3 × 10^5 in a normal brain) and the variability of brain anatomy among subjects. In our project the goal is, first, to design an effective method for the tract segmentation task based on machine learning and, second, to develop an interactive tool to help medical practitioners perform this task more precisely and easily. We propose a design of the interactive segmentation process consisting of two steps: tract identification and tract refinement. The tract identification step generates a first hypothesis of the segmentation, sparing the expert from starting from the whole tractography. This step uses manual segmentation examples from experts to create candidate tract segmentations, and is conceived as a supervised learning task. The next step aims at refining the proposed segmentation by removing or adding streamlines. To help medical practitioners perform this refinement more precisely and easily, it is necessary to cluster similar streamlines into one set, called a bundle; we design this as a clustering task. Some of our preliminary results are already used in a clinical use case: finding the differences between healthy brains and those of patients with amyotrophic lateral sclerosis (ALS). Based on this, we believe that with our work the task of tract segmentation can be performed more easily, at an acceptable computational cost and with high accuracy, and can benefit clinical applications.
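A toy version of the supervised identification step, assuming streamlines resampled to a common number of points and a minimum-direct-flip style distance with a k-nearest-neighbour vote (the actual distance and classifier in the project may differ):

```python
import numpy as np

def mdf(s1, s2):
    """Minimum direct-flip distance between two streamlines resampled to
    the same number of points: mean point-wise gap, taking the smaller
    of the direct and reversed orderings."""
    direct = np.mean(np.linalg.norm(s1 - s2, axis=1))
    flipped = np.mean(np.linalg.norm(s1 - s2[::-1], axis=1))
    return min(direct, flipped)

def knn_in_bundle(candidates, labeled, labels, k=3):
    """Label each candidate streamline (1 = in bundle, 0 = not) by
    majority vote of its k nearest expert-labeled examples."""
    out = []
    for s in candidates:
        d = np.array([mdf(s, t) for t in labeled])
        nearest = labels[np.argsort(d)[:k]]
        out.append(int(nearest.sum() * 2 > k))
    return np.array(out)
```

The predicted in-bundle set would serve as the first segmentation hypothesis, which the expert then refines interactively.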
Correlations between autonomic activity (AA) fluctuations and the BOLD signal are commonly considered artifacts in neuroimaging analysis. However, activity in certain regions may cause subsequent physiological fluctuations rather than be caused by them. Here we developed a specific experimental setup and a data-analysis method to identify regions where BOLD fluctuations systematically precede physiological ones during task and resting-state epochs. We chose fast repetition time (fast-TR, 0.4 s) fMRI to match the temporal resolution of BOLD and cardiac rate, avoiding aliasing and the loss of high-frequency information. The cognitive task, designed to elicit autonomic arousal, was to perform mathematical calculations for 100 s. We recorded cardiac pulsation during scanning and constructed a derived measure reflecting heart rate (HR). We developed a time-shifted sliding-window correlation method to identify voxels where BOLD activity at time T predicted HR in the following 10 s. Activity in the anterior cingulate cortex (ACC) systematically preceded HR fluctuations during both task and rest; in the ACC, a higher BOLD response preceded low-HR epochs, whereas other regions showed higher activity preceding high-HR epochs. This suggests a heterogeneous BOLD–AA relationship throughout the brain: the commonly used removal of AA-related variance may result in the loss of functionally relevant variance.
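The time-shifted sliding-window correlation can be sketched as below; the window handling is our guess at one reasonable implementation, with the 10 s lag corresponding to 25 samples at TR = 0.4 s:

```python
import numpy as np

def shifted_corr(bold, hr, lag):
    """Correlation between BOLD at time t and HR at time t + lag samples."""
    b = bold[:-lag] - bold[:-lag].mean()
    h = hr[lag:] - hr[lag:].mean()
    return (b * h).sum() / (np.linalg.norm(b) * np.linalg.norm(h) + 1e-12)

def sliding_shifted_corr(bold, hr, lag, win, step=1):
    """Time-shifted sliding-window correlation: r between BOLD in each
    window and HR in the same window shifted forward by `lag` samples."""
    return np.array([shifted_corr(bold[i:i + win + lag],
                                  hr[i:i + win + lag], lag)
                     for i in range(0, len(bold) - win - lag + 1, step)])
```

Voxels whose windowed correlations are consistently large (in either sign) would be the candidates where BOLD systematically precedes HR.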
Graph-structured data is becoming more and more abundant in many fields of science and engineering, such as social network analysis, molecular biology, chemistry and computer vision. Machine learning methods able to handle graph data sets efficiently are needed to exploit this kind of data, and successful application of machine learning and data analysis methods to graphs requires the ability to compare graphs efficiently. Graph kernels have attracted considerable interest in the machine learning community over the last decade as a promising solution to this issue. This talk is an introduction to graph kernels: we will review the state of the art, presenting the main strategies for defining graph kernels and analyzing their expressivity and computational complexity. We will also present some experiments reported in the literature and some available implementations.
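As a concrete example of a graph kernel, here is a minimal Weisfeiler-Lehman subtree kernel, one of the well-known families such a review covers. This simplified version hashes neighbourhood labels for a few iterations and takes the dot product of the resulting label histograms:

```python
from collections import Counter

def wl_features(adj, labels, iters=2):
    """WL label multiset for one graph (adjacency list + integer node
    labels): repeatedly relabel each node by (own label, sorted
    neighbour labels) and accumulate all label counts."""
    feats = Counter(labels)
    cur = list(labels)
    for _ in range(iters):
        cur = [hash((cur[v], tuple(sorted(cur[u] for u in adj[v]))))
               for v in range(len(adj))]
        feats.update(cur)
    return feats

def wl_kernel(g1, g2, iters=2):
    """WL subtree kernel value: dot product of the label histograms."""
    f1 = wl_features(*g1, iters)
    f2 = wl_features(*g2, iters)
    return sum(f1[k] * f2[k] for k in f1)
```

Because it compares multisets of compressed neighbourhood labels, the kernel runs in time roughly linear in the number of edges per iteration, which is the main reason for its popularity.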
Cluster analysis is an essential task in many fields of research that involve analyzing or processing multivariate data. Applying several clustering algorithms to the same data set can produce very different results: diverse clusterings of the data may be obtained, each contributing some valuable information to the problem at hand. Clustering ensemble algorithms emerge as a way of combining all this information into a consensus clustering. The result of this consensus should be a more reliable option than the arbitrary selection of any individual clustering. In this talk, I will present a short review of the state of the art on clustering ensemble algorithms and the main motivations of my PhD project. I will show the main results of my PhD thesis: clustering ensemble algorithms based on kernel functions; kernel functions to compare partitions; clustering ensemble algorithms for heterogeneous data; a new approach for selecting a representative level in a hierarchy of partitions using clustering ensembles; and an image segmentation ensemble algorithm.
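One classic flavour of clustering ensembles can be sketched with a co-association matrix plus a simple connected-components consensus. This is only one of several possible consensus functions, chosen here for brevity; the kernel-based methods mentioned above are more elaborate:

```python
import numpy as np

def coassociation(partitions):
    """Evidence accumulation: fraction of base partitions in which each
    pair of points falls in the same cluster."""
    P = np.asarray(partitions)            # shape (n_partitions, n_points)
    n = P.shape[1]
    C = np.zeros((n, n))
    for labels in P:
        C += labels[:, None] == labels[None, :]
    return C / len(P)

def consensus_clusters(C, thresh=0.5):
    """Group points whose co-association exceeds `thresh` via
    connected components (depth-first search)."""
    n = len(C)
    label = [-1] * n
    cur = 0
    for i in range(n):
        if label[i] == -1:
            stack, label[i] = [i], cur
            while stack:
                v = stack.pop()
                for u in range(n):
                    if label[u] == -1 and C[v, u] > thresh:
                        label[u] = cur
                        stack.append(u)
            cur += 1
    return label
```

Pairs that most base clusterings agree on end up together regardless of which individual algorithm produced which partition.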
Intuitive and efficient, the random subspace ensemble approach provides an appealing solution to the vast dimensionality of functional magnetic resonance imaging (fMRI) data for maximal-accuracy brain state decoding. Recently, efforts have been made to generate biologically plausible and interpretable maps of the brain regions that contribute information to the ensemble decoding task, and two approaches have been introduced: globally multivariate random subsampling and locally multivariate Monte Carlo mapping. Both types of maps reflect voxel-wise decoding accuracies averaged across repeatedly and randomly sampled voxel subsets, highlighting voxels which consistently participate in highly classifying subsets. We compare the mapping sensitivities of both approaches and demonstrate that exploiting spatial relationships yields dramatically improved voxel detection performance.
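The shared core of both mapping approaches, accuracy credited back to the voxels of each random subset, can be sketched as follows. A nearest-mean classifier stands in here for whatever decoder is actually used, and all names are ours:

```python
import numpy as np

def subspace_accuracy_map(X_tr, y_tr, X_te, y_te, n_draws=200, k=10, seed=0):
    """Random-subspace importance map: for each voxel, average the test
    accuracy of a nearest-mean classifier over the random k-voxel
    subsets that voxel participated in."""
    rng = np.random.default_rng(seed)
    n_vox = X_tr.shape[1]
    acc_sum = np.zeros(n_vox)
    counts = np.zeros(n_vox)
    for _ in range(n_draws):
        idx = rng.choice(n_vox, size=k, replace=False)
        m0 = X_tr[y_tr == 0][:, idx].mean(axis=0)   # class means on the subset
        m1 = X_tr[y_tr == 1][:, idx].mean(axis=0)
        d0 = np.linalg.norm(X_te[:, idx] - m0, axis=1)
        d1 = np.linalg.norm(X_te[:, idx] - m1, axis=1)
        acc = np.mean((d1 < d0) == (y_te == 1))
        acc_sum[idx] += acc
        counts[idx] += 1
    return acc_sum / np.maximum(counts, 1)
```

Voxels that repeatedly land in well-classifying subsets accumulate high average accuracies, which is exactly the map both approaches interpret.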
White matter fiber tracts describe the organization and connectivity of the human brain and can be reconstructed in vivo by means of diffusion Magnetic Resonance Imaging (dMRI) techniques. Neurological studies are often interested in identifying anatomically meaningful white matter fiber bundles. For this reason, algorithms for clustering fibers into bundles have received wide attention over the last years, and a constant effort has been made to incorporate prior knowledge. Despite this interest, the use of atlas information and expert-made segmentations has been limited. In this work in progress we focus on this kind of information and propose an algorithm to segment a given fiber bundle of interest from deterministic tractography data by means of binary classification of fiber tracts. The classifier is built from expert-made examples and addresses the case of multiple subjects. In this analysis we compare the popular k-Nearest Neighbour classification algorithm against the proposed dissimilarity-based approach and discuss the latter in the context of kernel methods. We show that the proposed method makes it possible to address the supervised fiber bundle segmentation problem with the vast majority of algorithms from the machine learning literature, motivating new and interesting lines of research.
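The dissimilarity-based idea can be illustrated by embedding each streamline as its vector of distances to a few prototype streamlines, after which ordinary vector-space (and kernel) classifiers apply directly. The distance and prototype choices here are illustrative, not the ones used in the work:

```python
import numpy as np

def streamline_dist(s, t):
    """Symmetric streamline distance: mean point-wise gap, taking the
    smaller of the direct and reversed point orderings."""
    return min(np.mean(np.linalg.norm(s - t, axis=1)),
               np.mean(np.linalg.norm(s - t[::-1], axis=1)))

def dissimilarity_space(streamlines, prototypes):
    """Embed each streamline as its vector of distances to a fixed set
    of prototype streamlines."""
    return np.array([[streamline_dist(s, p) for p in prototypes]
                     for s in streamlines])
```

Once streamlines live in this finite-dimensional space, any off-the-shelf classifier, not only kNN in the original streamline space, can be trained on the expert-made examples.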
Tensors are a powerful way to represent and analyze the most diverse types of data, with applications in image and video recognition, EEG and fMRI, text analysis and recommendation systems, to name a few. This presentation has four parts. A) We begin with a bird's-eye view of the domain: what are tensors? Why are they useful? What are the established approaches in tensor-based data analysis? B) Tensor-based techniques are mostly built on generalizations of the singular value decomposition. In this work we take a different perspective and rely on convex optimization. We study a broad class of non-smooth convex optimization problems for tensors, in which a penalty based on nuclear norms is used to enforce solutions with small (multilinear) ranks. A simple yet effective algorithm, termed Convex MultiLinear Estimation (CMLE), is proposed. C) We show how this algorithm can be specialized to accomplish different data-driven modeling tasks. Extending the existing taxonomy of learning to the case where input (and possibly output) patterns are represented as tensors, we call these problems unsupervised or supervised. This generalization is instrumental in dealing with important aspects, often overlooked in the tensor literature, such as the choice of loss functions, model selection, regularization and out-of-sample extensions. D) We present concrete examples ranging from image and video completion to low-rank denoising and classification. Particular attention is devoted to a case study on EEG data: several epileptic seizure detection systems apply traditional machine learning techniques to differentiate between ictal and non-ictal EEG segments, but they generally work on single channels and thus ignore the spatial distribution of the ictal pattern. We show how a tensor-based learning approach can overcome these limitations.
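A nuclear-norm penalty of the kind described in part B is commonly built from the tensor's mode unfoldings, summing the nuclear norm of each. The sketch below shows that building block only, not the CMLE algorithm itself:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: arrange the mode-`mode` fibers of tensor T as
    the rows' entries of a (shape[mode] x rest) matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def overlapped_nuclear_norm(T):
    """Sum of nuclear norms of all mode unfoldings: a convex penalty
    that promotes small multilinear (Tucker) rank."""
    return sum(np.linalg.norm(unfold(T, m), ord='nuc')
               for m in range(T.ndim))
```

For a rank-1 tensor every unfolding has rank 1, so the penalty reduces to the number of modes times the Frobenius norm; for higher multilinear ranks it grows with the spread of singular values in each unfolding, which is what drives low-rank solutions in the convex program.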