
expert reaction to study using Google Street View to reveal how the built environment correlates with risk of cardiovascular disease

A study published in The European Heart Journal looks at a correlation between the built environment and cardiovascular disease. 


Kevin McConway, Emeritus Professor of Applied Statistics, The Open University, said:

“This piece of research is an interesting start along a new road in investigating variation between different places in the risk of coronary heart disease (CHD, that is, heart attacks, angina, and related conditions). But, despite the heroic scale of the data analysis, it can’t yet clearly establish how useful these new tools will eventually turn out to be, or for what specific purposes. And, so far, it tells us little new, if anything, about what actually causes people to develop CHD, or about what we might do to lessen the risks. This line of research might well develop into something very useful, but I’d say that so far it’s much too early to say which way things will go.

“An obvious issue is that, to investigate the causes of any disease, or possible interventions to reduce the risk, we’ve got to have information on what causes what. However, this study is observational, and it works entirely at the level of areas and their populations, not at the level of individual people. Both of those features make it difficult to say anything about what causes what at an individual level.

“There will be many differences between people that live in neighbourhoods that look different on Google Street View, apart from what can be seen visually in those neighbourhoods. These other differences might be the actual cause of differences in CHD risk, and not the different appearance of the neighbourhoods at all. That’s an issue with all observational research. The research can show correlations, but correlation doesn’t necessarily mean causation.

“For instance, both the new research paper and the accompanying editorial mention that pollution, particularly air pollution, has previously been shown to be correlated with increased CHD risk. Exactly what detailed role air pollution has in causing CHD isn’t always so clear, but it’s interesting that the new research did not use any data on air pollution levels in the areas it looked at. Air pollution isn’t included in the ‘traditional’ demographic and socio-economic factors which the researchers used in their statistical models for comparison with what they found from the Google photos, nor in the three composite indices of social determinants of health that they also used. Maybe higher pollution levels could somehow affect how the areas looked on Google Street View and hence enter the statistical models indirectly, but it would at least have been interesting to include pollution measures more directly in the modelling.

“I’m not commenting here on how strong a role pollution levels might play in understanding the causes of CHD – but there could be an important role, this research can’t investigate it, and anyway pollution is just one example of factors that weren’t directly considered.

“At least there’s a possibility of measuring air pollution, and using those measures in future research. But there are many other possible aspects of cause and effect that might be at work. The researchers found that having buildings and roads in bad condition in an area, as seen on Google Street View, was correlated with higher risks of CHD, while having more trees and more houses in good condition was associated with lower CHD risk. But this doesn’t mean that the state of the houses directly causes CHD risks to be different. It might have a direct effect, or it might be that houses and roads in poor condition happen to be lived in by people who are at higher CHD risk for other reasons than the state of the buildings and roads. It’s not difficult to imagine how that might arise. We just can’t tell from this research even whether it’s happening.

“In the research paper, the researchers explicitly do point out the issue of cause: “…it is crucial to note that these correlations do not establish causality.” However, the press release doesn’t mention this at all (and it should have).

“Because the research is cross-sectional – that is, it looked at CHD risk and Google Street View images from more or less the same time – it also can’t directly look at how changes in what’s seen on Google Street View might be associated with changes in CHD risk. That’s just one more reason why it can’t directly tell us whether, or how, making improvements in the urban environment could change CHD risk for its inhabitants. And, as Dr Khera points out in the linked editorial, “Most large structural changes [in the built environment] occur in the context of gentrification, which merely displaces high-risk individuals [in terms of CHD risk] rather than improving their health in their original environment.”

“The issue about using data on neighbourhoods and groups of people, rather than individuals, is that neither local conditions as seen on Google Street View nor the risks of CHD in individuals will be exactly the same right across a census tract of maybe 4,000 people. The research has found correlations between average features of a census tract on Google Street View, and the average CHD risk in the tract. But that doesn’t imply that the individuals most likely to be diagnosed with CHD live in the parts of the census tract that showed up with houses looking in worse condition, or had other features in common out of the 4,096 features of the photos that the researchers’ models used. They might, but it remains possible they might not. This research can’t tell us which.

“This issue of assuming that correlations measured between averages of groups of people are reflected in correlations involving individuals is a well-known fallacy that has come up in many contexts. If you really want to be sure about what’s happening with individual people, you have to get data on individual people.

“I think the linked editorial by Rohan Khera is very good. While it shows reasonable enthusiasm about the future prospects for this kind of approach, it makes it very clear that things haven’t yet got very far. He draws attention to issues of cause and effect. He also points out that, while adding data based on the Google photos did increase the level of correlation with CHD risk above the level based only on demographic and socioeconomic characteristics, the gain was “somewhat modest”. The implication is that a considerable amount of the association between what emerged from the Google photos and CHD risk could be due to correlations between the photo information and previously known demographic and social issues. Using the photos did add something, but not a lot, and maybe some of that extra was already associated with factors like pollution and climate that weren’t explicitly included in the new models at all.

“Dr Khera also points out that the method of validation of the statistical model based on Google Street View didn’t go further than using the data from the same census tracts, albeit in a useful way that’s pretty common in machine learning. But there’s no guarantee at all that the same models, or even the same approach, would be helpful even in other parts of the US such as smaller cities and rural communities, let alone in very different places in other parts of the world (including the UK). And he rightly draws attention to ethical issues, such as the rights of people living in the communities to have their neighbourhood’s images used for these purposes, when they didn’t explicitly consent even to having the images available for anyone to see or reuse.”



‘Artificial intelligence–based assessment of built environment from Google Street View and coronary artery disease prevalence’ by Zhuo Chen et al. was published in The European Heart Journal at 00.05 hrs GMT, Thursday 28 March 2024.





Declared interests

Prof McConway: I am a Trustee of the SMC and a member of its Advisory Committee.  My quote above is in my capacity as an independent professional statistician.
