expert reaction to study on AI for the detection of neurodevelopmental disorders

A study, published in PNAS, reports the use of AI to detect a potential biomarker for the early stages of neurodevelopmental disorders, including autism. 


Prof Dorothy Bishop, Professor of Developmental Neuropsychology, University of Oxford, said:

“This is a complex study which includes a range of experiments on genetically-modified mice. I shall focus on the implications for humans, which are emphasised in the Abstract and the Significance statement. The claim is that a measure of variation in spontaneous arousal, developed in mouse models, could be used as a ‘biomarker for the early detection of developmental disorders’, particularly autism.

“These conclusions go wildly beyond anything demonstrated in the article. The part of the study concerned with humans compared 35 girls with Rett syndrome and 40 typically developing control girls. Rett syndrome is a rare genetic disorder affecting around 1 in 10,000 females, in which there is developmental regression, loss of motor skills, stereotypical hand movements, and some associated autistic features. The paper was not easy to follow but seemed to show that heart rate shows more spontaneous fluctuations in girls with Rett syndrome than in control girls, which is different from what has been observed in autism, where it is more usual to see reduced heart rate variability.

“There are two aspects of the report that are of particular concern. First, it makes claims about the ability of the test to detect ‘neurodevelopmental spectrum disorders such as autism’, when in fact the focus is on a rare genetic condition that results in some autistic features in the context of developmental regression. This is misleading. Second, to make a far more restricted claim that the arousal measure might be used for early detection of Rett syndrome, the researchers would need to show that the arousal variability measure had good sensitivity and specificity for detecting the disorder in the general population. The detailed data on accuracy of the measure is relegated to Supplementary material that is not available to me prior to publication, but given that the prevalence of Rett syndrome is 1 in 10,000 females, it is simply not feasible that this measure would provide useful information, over and above a genetic test to identify the causal mutation.”
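
One way to see why the low prevalence matters is to work out the positive predictive value, the share of positive results that are true cases. The sketch below applies Bayes' rule to the 1 in 10,000 prevalence quoted above. The sensitivity and specificity values are hypothetical, chosen only for illustration; they are not taken from the paper, whose accuracy data were not available to the commenter.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive test result is a true case (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Prevalence quoted in the text: about 1 in 10,000 females.
PREVALENCE = 1 / 10_000

# Hypothetical accuracy figures, for illustration only (not from the paper).
for sensitivity, specificity in [(0.95, 0.95), (0.99, 0.99), (0.99, 0.999)]:
    ppv = positive_predictive_value(PREVALENCE, sensitivity, specificity)
    print(f"sensitivity={sensitivity}, specificity={specificity}: PPV={ppv:.4f}")
```

Under these assumed figures, a test with 95% sensitivity and 95% specificity would give a positive predictive value of about 0.2% in the general population: almost all positive results would be false alarms.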


Dr Franz Kiraly, UCL and Turing Fellow in the Department for Statistical Science, Alan Turing Institute, said:

“A problem with the paper is that descriptions of methodology are somewhat vague – especially how the AI algorithms are applied. There are likely critical technical issues, and further major issues, that may render the entire work unreliable. A conclusive judgment on whether the issues are indeed critical could be obtained by reviewing the code, but the researchers have not shared it, or the data.

“For the “mouse” experiment, the authors report 97% accuracy on classifying 36 mice – 30 wild type (“healthy”) and 6 knock-out mice (“sick”). The width of a 95% confidence interval on that number (36 mice) is about 25%, and on the 25% hold-out set (9 mice) it increases to about 50%. This means that 97% accuracy on such a small number of sick mice can very plausibly be explained by randomness.

“Other factors are a potential problem. It doesn’t look like the authors discuss these, or account for them. For example: how do we know that the neural networks didn’t base their discrimination on something that wasn’t to do with the pupils or the heart rate at all?

“The intended stage of this study is probably “proof-of-concept”, though due to the small data size and potential issues this is more in the ideas stage. What the research would need is corroboration on a larger data set, independent code review, and perhaps an entire independent replication.

“The conclusions paragraph of the study is full of speculation. For example, the major conclusion that these results illustrate the utility of this approach as a first-pass screening tool warning of impending neurodevelopmental abnormalities for timely intervention is incorrect under any circumstances. Nothing in the research illustrates utility for early warning and intervention of neurodevelopmental issues, as it is not studied and evaluated prospectively as a clinical tool. There is a further claim in the “significance” box that the study was successfully applied to (human) autism spectrum disorders (ASD). This is false since all human patients were suffering from Rett syndrome, a rarer condition not representative of ASD. Results are hence not directly transferable to autism. Even if the study were applicable, given its technical state, a clinically applicable test isn’t on the horizon and should be cautioned against.

“A summary:

  • The study is not representative of autism spectrum disorders – at best this gives indications for the rarer Rett syndrome (which is not an ASD, or at least not a typical one).
  • There are severe technical issues with the work, which make many of the claims questionable.
  • There may be ethical issues as well, should the authors attempt to disseminate their work as a “miracle cure” before the issues are addressed.”
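
The point above about confidence-interval width on small samples can be illustrated with a short sketch. It uses the Wilson score interval for a proportion, which is an assumption: the commenter does not say which method produced the quoted widths, so the numbers printed here need not match them. The counts (35 of 36 and 9 of 9 classified correctly) are likewise hypothetical stand-ins for "97% accuracy on 36 mice" and a 9-mouse hold-out set.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion, clamped to [0, 1]."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half = z * sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts, for illustration only (not taken from the paper).
for correct, n in [(35, 36), (9, 9)]:
    low, high = wilson_interval(correct, n)
    print(f"{correct}/{n} correct: 95% interval {low:.2f} to {high:.2f}, "
          f"width {high - low:.2f}")
```

Whatever method is used, the interval roughly doubles in width when the sample shrinks from 36 to 9, because the uncertainty scales with one over the square root of the sample size.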


Prof David Curtis, Honorary Professor, UCL Genetics Institute, said:

“The study shows that mice with genetic abnormalities designed to mimic those which can cause autism in humans tended to spend more time in a more aroused state, as measured by having enlarged pupils for much of the time. Likewise, children with a diagnosis of Rett syndrome, in which autistic features can occur, also spent more time in an aroused state, as measured by longer periods of increased heart rate. Looking at their results, these differences were quite marked. They used a neural network to recognise these differences but I’m not sure that other, much simpler, methods might not have performed just as well.

“There is a suggestion that in some way a mouse model mimics human Rett syndrome but I think all we are really seeing is a tendency for increased arousal to be seen in both conditions.

“Although the authors suggest that such approaches might be used to produce earlier diagnosis of Rett syndrome, it’s hard to see how this would work in practice. The heart rate measurements involved attaching the child to an ECG for an hour at a time. I’m not sure one would want to do this to one-year-old children on a routine basis. If a diagnosis of Rett syndrome were suspected then one could simply do a genetic test instead. We should also remember that Rett syndrome is an extremely rare and unusual form of autism. We don’t know if similar results would be found in more typical forms of autism.”


Dr Cathy Fernandes, Lecturer & Programme Leader MSc Genes, Environment & Development, King’s College London (KCL), said:

“This article presents interesting research into a potential non-invasive, measurable indicator (a biomarker) to help diagnose autism spectrum disorders early in life before key periods of brain development have occurred and behavioural intervention becomes less effective. Finding early biomarkers for early diagnosis of autism would make an important impact on families if they need to access healthcare and related support services in a timely manner.

“The research takes a large leap in translating the results from a small mouse study to suggesting a biomarker for all types of autism. The authors suggest that spontaneous changes in pupil diameter and heart rate in response to arousal are robust and sensitive biomarkers for the early detection of autism. However, we should be very cautious translating the findings from this mouse study to humans for a number of reasons.

  1. All the mice studied were already equivalent to adolescents/young adults so the findings don’t support the value of this measure as a very early predictor of a neurodevelopmental condition such as autism.
  2. Pupil diameter changes in response to arousal are not thought to be related to light but there may still be a problem with translating measures in mice, which are nocturnal animals with poor visual ability. Smell and touch are far more important senses in mice than vision and so it may be more relevant to study changes in these senses in response to arousal than vision.
  3. The research was conducted on very small numbers of mice (fewer than ten mice per group, with a mix of male and female mice, so some results could be confounded by sex differences).
  4. The genetically modified mice studied are models of rare, syndromic forms of autism so not representative of the majority of autism spectrum disorders.
  5. The study needs to be independently replicated before robust or reliable conclusions can be made regarding the value of this proposed biomarker.”


Prof Duc Pham, Chance Professor of Engineering and Head of School, University of Birmingham, said:

“The idea of this research is to reuse the learning of one task for another task.  In this case, the learning of the signs of autism (abnormalities in pupil fluctuations) in mice was the starting point for learning to recognise Rett syndrome in humans. The authors have also demonstrated how a system for detecting autism in mice using pupil fluctuation information could be readily retrained to identify Rett syndrome in humans from heart rate variation data.

“The key assumptions in both situations are that there are similarities between the symptoms of Rett syndrome in humans and autism in mice and correlation between one measurement modality (pupil fluctuations) and another (heart rate variations).  Provided that those assumptions hold, the problem can be reduced to that of applying transfer learning, an effective method of speeding up and/or increasing the accuracy of machine learning.  However, due to possible confusion with autism or cerebral palsy, it is unlikely that the proposed machine learning technique can by itself reliably identify Rett syndrome without additional diagnostic procedures.”


* ‘Deep learning of spontaneous arousal fluctuations detects early cholinergic defects across neurodevelopmental mouse models and patients’ by Artoni et al. was published in PNAS at 20:00 UK time on Monday 22nd July.

DOI: 10.1073/pnas.1820847116


Declared interests

Prof Dorothy Bishop: “I have no conflict of interest.”

Dr Franz Kiraly: “I’m affiliated with the Alan Turing Institute, whose health programme works on related topics (though I’m not directly involved in the mental health projects).”

Prof David Curtis: “I have no conflict of interest.”

Dr Cathy Fernandes: “I declare I do not have any conflicting interests with my role as an SMC expert in this story.”

None others received.