Data Science for the study of Alzheimer's disease: patterns hidden in structural connectivity

Download the Paper produced through the collaboration between Exprivia and the Universidad de Zaragoza.
An important result achieved thanks to the virtuous relationship between academic and industrial research.

Authors:
Ernesto Estrada
Institute of Applied Mathematics (IUMA), Universidad de Zaragoza and ARAID Foundation, Government of Aragon
Eufemia Lella
Innovation Lab Exprivia and Istituto Nazionale di Fisica Nucleare, sezione di Bari

One of the biggest challenges of Neuroscience research is to shed light on the mechanisms of neurodegeneration that give rise to brain diseases. Data Science provides essential tools for extracting otherwise hidden information from clinical, genetic and imaging data. In particular, diffusion-weighted magnetic resonance imaging (MRI) provides information on the structural connections between brain regions and therefore allows us to study how these connections change in the presence of disease. Identifying structural patterns leads to the discovery of quantitative biomarkers, paving the way for the use of Artificial Intelligence to diagnose neurodegenerative diseases as early as possible.

Neuroscience, equipped with the tools of Data Science and AI, today plays a fundamental role, particularly in research on Alzheimer's disease, the most widespread and disabling neurodegenerative disease, for which there is to date no cure. Early diagnosis is a major challenge and an important goal for the testing of new drugs.

Data Science Methodologies in Neuroscience: the brain as a complex network
The starting data, provided by MRI examinations, are 3D images of the brain in which the gray level of each voxel carries information specific to the type of examination performed. These images are "raw" data, so interpreting them and extracting knowledge from them requires methodologies that model and filter the information. Data Science, with statistical and Machine Learning techniques, provides the tools to extract information from large amounts of data, starting from their "raw" format. In Neuroscience in particular, feature engineering on brain imaging data often relies on Complex Network Theory to extract measures of connectivity between brain regions.

The brain is the most complex of complex systems: it is made up of different parts whose interaction produces behaviors that would not emerge from the parts considered individually. By reconstructing the fiber tracts that connect its anatomical regions with so-called "tractography" algorithms, the brain can be modeled as a complex network, represented by a graph whose nodes are the anatomical regions and whose links are weighted by the number of fiber tracts connecting them. The topological and connectivity characteristics of the graph are studied with Complex Network Theory, which makes it possible to describe the connectivity of the brain quantitatively and to highlight its alterations in the presence of disease. For example, Complex Network Theory can identify the nodes with the most connections in the network, the so-called "hubs", the nodes and connections that matter most for information traffic in the network, the tendency of nodes to form clusters, and the tendency of the network to form modules. These configurations of the brain network may change in the presence of disease.
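
As a minimal sketch of the graph model described above, the following Python snippet builds a toy connectome and computes two of the measures mentioned: the number of connections per node (to find a hub) and the local clustering coefficient. The matrix values, the five-region size and the function names are hypothetical, chosen only for illustration; they are not data or code from the paper.

```python
import numpy as np

# Hypothetical toy connectome: entry [i, j] is the number of reconstructed
# fiber tracts between regions i and j (symmetric, zero diagonal).
fiber_counts = np.array([
    [0, 12, 7, 3, 0],
    [12, 0, 5, 0, 0],
    [7, 5, 0, 0, 0],
    [3, 0, 0, 0, 9],
    [0, 0, 0, 9, 0],
])


def node_degrees(weights):
    """Number of regions each region is directly connected to."""
    return (weights > 0).sum(axis=1)


def local_clustering(weights):
    """Fraction of each node's neighbor pairs that are themselves linked.

    Computed on the binarized graph: (A^3)[i, i] counts closed walks of
    length 3 through node i, which is twice its number of triangles.
    """
    a = (weights > 0).astype(float)
    k = a.sum(axis=1)
    triangles = np.diag(a @ a @ a) / 2.0
    possible_pairs = k * (k - 1) / 2.0
    result = np.zeros_like(k)
    np.divide(triangles, possible_pairs, out=result, where=possible_pairs > 0)
    return result


degrees = node_degrees(fiber_counts)
hub = int(np.argmax(degrees))          # region with the most connections
clustering = local_clustering(fiber_counts)
print("degrees:", degrees, "hub:", hub)
print("clustering:", clustering)
```

Binarizing the matrix before computing the clustering coefficient is a simplification; weighted variants of these measures exist and may be more appropriate for fiber-count data.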

Communicability Distance highlights mechanisms of connectivity alteration in Alzheimer's disease
The classic concept of distance between two nodes of a network is the length of the shortest path between them, i.e. the minimum number of links that must be traversed to go from one node to the other. The most common assumption in Complex Network Theory is that communication between two nodes occurs along this shortest path. But in many real-world networks information can also flow through paths that are not the shortest. Think, for example, of the spread of gossip in social networks, where information can flow back and forth several times before reaching its final destination. In these cases, the shortest path is not sufficient to fully describe communicability within the network.
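
The contrast between the single shortest path and the many longer routes can be sketched as follows. The four-node graph and the function names are hypothetical. The snippet measures the classic hop distance with a breadth-first search and counts the routes of a given length, allowing nodes to be revisited (walks), using powers of the adjacency matrix.

```python
from collections import deque

import numpy as np

# Hypothetical undirected toy network with links 0-1, 0-2, 1-2 and 2-3.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])


def hop_distance(adj, source, target):
    """Classic shortest-path length (number of links), via breadth-first search."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distances[node]
        for neighbor in np.flatnonzero(adj[node]):
            neighbor = int(neighbor)
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return None  # target is not reachable from source


def count_walks(adj, source, target, length):
    """Number of walks with exactly `length` links: entry of the matrix power."""
    return int(np.linalg.matrix_power(adj, length)[source, target])


print("hop distance 0 -> 3:", hop_distance(adjacency, 0, 3))
for length in (2, 3, 4):
    print(f"walks of length {length}:", count_walks(adjacency, 0, 3, length))
```

The hop distance sees only the first of these routes, while the longer walks are exactly the back-and-forth flows that the gossip example describes.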

In 2008 Ernesto Estrada and Naomichi Hatano proposed a new metric for complex networks, communicability, which introduced a new concept of communication within the network. Communicability takes into account not only the shortest path between two nodes but all possible paths between them, giving greater weight to the shorter ones. Building on communicability, Ernesto Estrada introduced a new concept of distance in complex networks, the communicability distance between two nodes, which is associated with the difference between the amount of information that leaves a node and returns to it and the amount of information that leaves one node and reaches the other. These concepts of communication and distance are particularly suited to networks in which information flows through a diffusion process, as happens in the brain connectome extracted from diffusion-weighted MRI images. Moreover, this concept of distance is well suited to characterizing connectivity alterations in a brain disease such as Alzheimer's disease, which some researchers hypothesize to be a "disconnection syndrome", that is, a disease related to the loss of integrity of the white matter fibers that form the communication pathways between brain regions.
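
The two quantities can be written compactly using the definitions published by Estrada and Hatano. The communicability between nodes p and q is the (p, q) entry of the matrix exponential G = exp(A) of the adjacency matrix A, and the communicability distance is xi_pq = sqrt(G_pp + G_qq - 2 G_pq). The sketch below assumes a binary, symmetric adjacency matrix with no normalization or tuning parameter; how the paper preprocesses the weighted connectome is not stated here, so treat those choices as assumptions. Function names are hypothetical.

```python
import numpy as np


def communicability_matrix(adjacency):
    """G = exp(A), computed from the eigendecomposition of a symmetric A.

    G[p, q] sums the walks between p and q of every length k, each
    weighted by 1/k!, so shorter routes contribute more.
    """
    a = np.asarray(adjacency, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    return (eigenvectors * np.exp(eigenvalues)) @ eigenvectors.T


def communicability_distance(adjacency):
    """xi[p, q] = sqrt(G_pp + G_qq - 2 * G_pq)."""
    g = communicability_matrix(adjacency)
    self_communicability = np.diag(g)  # information returning to each node
    squared = (self_communicability[:, None]
               + self_communicability[None, :]
               - 2.0 * g)
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.clip(squared, 0.0, None))


# Hypothetical example: two regions joined by a single link.
two_regions = np.array([[0, 1], [1, 0]])
print(communicability_distance(two_regions))
```

The diagonal terms G_pp and G_qq correspond to information that returns to its starting node, and G_pq to information that reaches the other node, which matches the verbal description above.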

The communicability distance has been used to highlight alterations in connectivity between brain regions in Alzheimer's patients. Starting from the Susceptible-Infected (SI) epidemiological model of the propagation of a Disease Factor (DF), it was shown that the communicability distance represents the difference between the circulability of the DF around two regions and its transmissibility from one region to the other. From the communicability distance, the shortest path in terms of communicability between pairs of brain regions was then calculated, and this measure proved much more significant than the classic shortest path in detecting differences in connectivity between the healthy group and the patient group. Using Data Science feature selection methodologies, such as statistical permutation tests, 399 pairs of regions were found to differ significantly in shortest communicability path length. Most of these pairs involve regions in different hemispheres, two regions of the left hemisphere, or connections with the Vermis, in agreement with the hypothesis that Alzheimer's disease is a disconnection syndrome. In addition, the connections with the largest average difference in shortest communicability path length between the healthy group and the patient group involve clusters of regions that include different areas of the cerebellum, the Vermis and the Amygdala. A very interesting result, which emerged thanks to the statistical techniques of Data Science and which represents a connectivity pattern of the brain of Alzheimer's patients, is that in 76.9% of the damaged brain regions the shortest communicability path length decreases in patients compared to healthy subjects.
This counterintuitive result indicates that Alzheimer's disease transforms the brain network into a system that transmits the disease more efficiently, because it decreases the circulability of the DF around brain regions relative to its transmissibility to other regions.
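
The two analysis steps described above can be sketched under stated assumptions. First, each existing link is assigned its communicability distance as a cost and the cheapest route between every pair of regions is found (here with the Floyd-Warshall algorithm). Second, a two-sided permutation test compares one pair's path length between groups. The binary adjacency matrix, the use of the difference of group means as test statistic, the function names and all numbers are assumptions made for illustration, not the paper's actual pipeline or data.

```python
import numpy as np


def communicability_distance(adjacency):
    """xi[p, q] = sqrt(G_pp + G_qq - 2 * G_pq), with G = exp(A), A symmetric."""
    a = np.asarray(adjacency, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    g = (eigenvectors * np.exp(eigenvalues)) @ eigenvectors.T
    diagonal = np.diag(g)
    squared = diagonal[:, None] + diagonal[None, :] - 2.0 * g
    return np.sqrt(np.clip(squared, 0.0, None))


def shortest_communicability_paths(adjacency):
    """All-pairs cheapest routes when each link costs its communicability distance."""
    a = np.asarray(adjacency, dtype=float)
    cost = np.where(a > 0, communicability_distance(a), np.inf)
    np.fill_diagonal(cost, 0.0)
    for k in range(cost.shape[0]):  # Floyd-Warshall relaxation through node k
        cost = np.minimum(cost, cost[:, [k]] + cost[[k], :])
    return cost


def permutation_p_value(healthy, patients, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = np.random.default_rng(seed)
    healthy = np.asarray(healthy, dtype=float)
    patients = np.asarray(patients, dtype=float)
    pooled = np.concatenate([healthy, patients])
    observed = abs(healthy.mean() - patients.mean())
    hits = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)  # random reassignment of group labels
        difference = abs(shuffled[:healthy.size].mean()
                         - shuffled[healthy.size:].mean())
        hits += int(difference >= observed)
    return (hits + 1) / (n_permutations + 1)


# Hypothetical three-region chain 0-1-2: the route from 0 to 2 passes through 1.
chain = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
print(shortest_communicability_paths(chain))

# Made-up path lengths for one pair of regions in two groups of six subjects.
healthy_lengths = [5.0, 5.1, 4.9, 5.2, 5.0, 5.1]
patient_lengths = [4.0, 4.1, 3.9, 4.2, 4.0, 4.1]
print(permutation_p_value(healthy_lengths, patient_lengths, n_permutations=2000))
```

In a real study this test would be repeated for every pair of regions, which in general calls for some control of multiple comparisons; that step is omitted from this sketch.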

New perspectives for AI in the diagnosis of Alzheimer's disease
The discovery of new biomarkers of neurodegenerative diseases, and of patterns related to the mechanisms by which brain functioning is altered, always opens new and intriguing perspectives for the use of AI in automatic diagnosis at the earliest possible stage. In particular, these recent scientific results suggest that a new distance metric can effectively describe the flow of information between brain regions, opening the way to new analyses of the topology and organization of the brain network and therefore to the extraction of new informative features to be given as input to AI algorithms in order to build predictive models and implement new tools for automatic diagnosis.