Indirect Measurement of Left Ventricular Filling Pressure (LVFP)
Number: 0704
Table Of Contents
Policy
Applicable CPT / HCPCS / ICD-10 Codes
Background
References
Policy
Scope of Policy
This Clinical Policy Bulletin addresses indirect measurement of left ventricular filling pressure (LVFP).
Experimental, Investigational, or Unproven
Aetna considers the following interventions experimental, investigational, or unproven because the effectiveness of these approaches has not been established:
- Indirect measurement of left ventricular filling pressure by computerized calibration of arterial waveform response to the Valsalva maneuver (e.g., the VeriCor® System);
- EchoGo Heart Failure for detection of heart failure with preserved ejection fraction, and all other indications.
Background
Left ventricular filling pressure can be measured invasively by either:
- direct measurement via placement of a catheter in the left ventricle, or
- indirect measurement by placement of a catheter in the pulmonary artery to measure the pulmonary capillary wedge pressure (PCWP).
The VeriCor System (CVP Diagnostics, Boston, MA) involves the indirect measurement of left ventricular filling pressure (LVFP) by analysis of arterial waveform response to the Valsalva maneuver using a proprietary algorithm. The VeriCor System was cleared by the U.S. Food and Drug Administration based on a 510(k) premarket notification.
To examine the accuracy of a non-invasive system in estimating left ventricular end-diastolic pressure (LVEDP), Sharma et al (2002) measured PCWP and LVEDP in 57 persons during elective right and left heart catheterization, followed immediately by a Valsalva maneuver using the VeriCor System. The VeriCor and PCWP measurements were then compared with the catheter-measured LVEDP. VeriCor measurements correlated significantly with catheter-measured LVEDP (r = 0.86), comparable to the correlation of the PCWP with catheter-measured LVEDP (r = 0.81). VeriCor measurements were within 4 mm Hg of direct LVEDP measurements 84 % of the time and within 6 mm Hg of these measurements 93 % of the time, whereas the corresponding values for PCWP were 41 % and 67 %, respectively.
According to the American College of Cardiology/American Heart Association (ACC/AHA) Practice Guidelines on Heart Failure (2001), the role of periodic invasive or non-invasive hemodynamic measurements in the management of heart failure remains uncertain. The guidelines stated that, although hemodynamic measurements can be performed by non-invasive methods, these tests have not been shown to be more valuable than routine tests, including physical examination. Moreover, it is not clear whether serial non-invasive hemodynamic measurements can be used to gauge the efficacy of treatment or to identify patients most likely to deteriorate symptomatically during long-term follow-up.
There is inadequate evidence of the clinical utility of indirect measurement of LVFP by computerized calibration of the arterial waveform response to the Valsalva maneuver. Clinical outcome studies published in the peer-reviewed medical literature are necessary to determine the value of this test in the clinical management of patients with congestive heart failure (CHF).
In a small, prospective study, Sharma and colleagues (2011) examined whether non-invasive monitoring of LVEDP would reduce re-hospitalization rates in patients hospitalized for heart failure (HF). A total of 50 patients admitted for HF were randomized to management guided by daily non-invasive estimated LVEDP monitoring (group I, open) to a target LVEDP of less than 20 mm Hg or management based on clinical assessment alone without knowledge of the estimated LVEDP (group II, blinded). Non-invasive estimated LVEDP was measured by the VeriCor monitor. The primary endpoints were the reduction of estimated LVEDP at discharge and the HF re-hospitalization rate on follow-up. Estimated LVEDP was significantly reduced at discharge in the open group compared with the blinded group (mean estimated LVEDP 19.7 +/- 1.3 mm Hg versus 25.6 +/- 1.5 mm Hg, respectively, p = 0.01). The re-hospitalization rates for HF on follow-up were significantly improved in the open group compared with the blinded group (at 1 month: 0 % versus 25 %, respectively [p = 0.05]; at 3 months: 0 % versus 32 % [p = 0.01]; at 6 months: 4 % versus 36 % [p = 0.01]; at 1 year: 16 % versus 48 % [p = 0.03]). The authors concluded that when HF is managed by clinical assessment only, estimated LVEDPs remain high at discharge, resulting in early and frequent re-hospitalization for HF. Therapy guided by estimated LVEDP monitoring optimizes filling pressures and reduces HF re-hospitalization rates. Findings of this small study need to be validated by well-designed studies.
EchoGo Heart Failure
EchoGo Heart Failure is an artificial intelligence (AI)-based platform for detection of heart failure with preserved ejection fraction (HFpEF). The device employs AI to detect HFpEF from a single echocardiogram image. HFpEF accounts for 50 % of the 64 million cases of HF worldwide and has overtaken heart failure with reduced ejection fraction (HFrEF) as the most prevalent form of the disease.
Wang et al (2022) noted that cardiovascular risk factors, biomarkers, and diseases are associated with poor prognosis in COVID-19 infections. Significant progress in artificial intelligence (AI) applied to cardiac imaging has recently been made. In a single-center study, these researchers examined the use of AI analytic software EchoGo in COVID-19 inpatients. A total of 50 consecutive COVID-19+ inpatients (age of 66 ± 13 years, 22 women) who had echocardiography from April 17, 2020 to August 5, 2020 were analyzed with EchoGo software, with output correlated against standard echocardiography measurements. After adjustment for the APACHE-IV score, associations with clinical outcomes were assessed. Mean EchoGo outputs were left ventricular end-diastolic volume (LVEDV) 121 ± 42 ml, end-systolic volume (LVESV) 53 ± 30 ml, ejection fraction (LVEF) 58 ± 11 %, and global longitudinal strain (LVGLS) −16.1 ± 5.1 %. Pearson correlation coefficients (p-value) with standard measurements were 0.810 (< 0.001), 0.873 (< 0.001), 0.528 (< 0.001), and 0.690 (< 0.001), respectively. The primary endpoint occurred in 26 (52 %) patients. Adjusting for APACHE-IV score, EchoGo LVEF and LVGLS were associated with the primary endpoint, with odds ratios of 0.92 (95 % confidence interval [CI]: 0.85 to 0.99) and 1.22 (95 % CI: 1.03 to 1.45) per 1 % increase, respectively. The authors concluded that automated AI software is a new clinical tool that may assist with patient care. EchoGo LVEF and LVGLS were associated with adverse outcomes in hospitalized COVID-19 patients and can play a role in their risk stratification.
The authors stated that this study had several drawbacks. First, it was an observational, cohort, single-center study with inherent biases. Second, study power and multi-variable analyses were restrained by the number of patients and clinical events, so the APACHE-IV score was used as a surrogate to measure global clinical risk. Third, a minority of patients were excluded because of suboptimal image quality, which was to be expected for bedside trans-thoracic echocardiography (TTE) studies of sick COVID-19 patients, some of whom were in the intensive care unit. Fourth, the EchoGo software currently only analyzes a limited number of TTE parameters, although there is ongoing software development to expand its analytic capabilities. The Velocity Vector Imaging technique was used for standard strain measurement analysis as it is a vendor-neutral method, although it is known to yield a slightly lower magnitude of LVGLS than other vendors' software such as GE EchoPAC, which may also explain its slightly lower LVGLS values compared with EchoGo. These researchers also focused on evaluating associations between in-hospital outcomes and TTE, including EchoGo measurements, rather than longer-term outcomes beyond hospital discharge, where further research is needed.
O'Driscoll et al (2022) examined whether LVEF and global longitudinal strain (GLS), automatically calculated by AI, would increase the diagnostic performance of stress echocardiography (SE) for coronary artery disease (CAD) detection. In a multi-center study, SEs from 512 subjects who underwent a clinically indicated SE (with or without contrast) for the evaluation of CAD from 7 hospitals in the U.K. and U.S. were studied. Visual wall motion scoring (WMS) was carried out to identify inducible ischemia. Furthermore, SE images at rest and stress underwent AI contouring for automated calculation of AI-LVEF and AI-GLS (apical 2 and 4 chamber images only) with Ultromics EchoGo Core 1.0. Receiver operating characteristic (ROC) curves and multi-variable risk models were used to examine accuracy for identification of participants subsequently found to have CAD on angiography. Participants with significant CAD were more likely to have abnormal WMS, AI-LVEF, and AI-GLS values at rest and stress (all p < 0.001). The areas under the ROC curve (AUC) for WMS index, AI-LVEF, and AI-GLS at peak stress were 0.92, 0.86, and 0.82, respectively, with cut-offs of 1.12, 64 %, and -17.2 %, respectively. Multi-variable analysis showed that addition of peak AI-LVEF or peak AI-GLS to WMS significantly improved model discrimination of CAD [C-statistic (bootstrapping 2.5th, 97.5th percentile)] from 0.78 (0.69 to 0.87) to 0.83 (0.74 to 0.91) or 0.84 (0.75 to 0.92), respectively. The authors concluded that automated AI quantification of LVEF and GLS in contrast-enhanced and unenhanced SE images was feasible both at rest and with different modes of stress in a multi-center study. These measures conferred additional independent prognostic information in participants with suspected obstructive CAD, above and beyond inducible wall motion abnormalities alone.
These findings supported the increased use of quantification in SE in order to improve its diagnostic performance and use in identifying and managing CAD. These investigators stated that future research is needed to examine the impact AI may have on predicting long-term adverse outcomes in patients undergoing echocardiography.
The authors stated that this study had several drawbacks. First, the method of CAD classification, whereby presence of inducible ischemia was used to determine whether participants underwent coronary angiography, introduced a case selection bias for diagnosis of significant CAD. In addition, no central reading or quality control of readers was carried out before entering data in the data bank, and the classification of CAD did not include data on fractional flow reserve. Second, additional case selection bias might also have been introduced by the retrospective nature of the RAINIER study. This resulted in high absolute area under receiver-operating characteristic curves (AUROCs) and odds ratio for diagnostic performance of WMSI but still allows relative evaluation of WMSI versus LVEF and GLS. Third, the derivation of GLS used only the apical 2- and 4-chamber views and the effect of including the 3-chamber view is unclear. Fourth, previous studies have shown that quantification of transient ischemic dilatation is an independent predictor of mortality in patients with CAD, and a marker of multi-vessel disease, whereas this study has focused on ischemic dilatation at end-diastole and end-systole and shown they are useful for identifying significant CAD. Fifth, those undergoing pre-operative assessment were excluded from this analysis and evaluation for this patient group would also be of interest. Sixth, although these researchers have demonstrated an incremental benefit of the use of LVEF and GLS in SE, they have not compared this increase in accuracy to their parallel developments in the use of AI to provide autonomous diagnostic assessment of the likelihood of CAD based on combinations of multiple parameters. 
These researchers are currently performing the multi-center PROTEUS trial (PROspective randomised controlled Trial Evaluating the Use of artificial intelligence in Stress echocardiography; ISRCTN registry ID 15113915) in 2,500 participants to examine the performance of the EchoGo platform for identifying significant CAD and its effect on the rate of unnecessary angiography and healthcare costs. Furthermore, whether GLS and LVEF have value in those who do not achieve peak stress may be of interest for future study.
Upton et al (2022) examined if an AI system can be developed to automate stress echocardiography analysis and support clinician interpretation. These researchers developed an automated image processing pipeline to extract novel geometric and kinematic features from stress echocardiograms collected as part of a large, prospective, UK-based multi-center, multi-vendor study. An ensemble machine learning (ML) classifier was trained, using the extracted features, to identify patients with severe CAD on invasive coronary angiography. The model was tested in an independent US study. Acceptable classification accuracy for identification of patients with severe CAD in the training data set was achieved on cross-fold validation based on 31 unique geometric and kinematic features, with a specificity of 92.7 % and a sensitivity of 84.4 %. This accuracy was maintained in the independent validation data set. The use of the AI classification tool by clinicians increased inter-reader agreement and confidence as well as sensitivity for detection of disease by 10 % to achieve an AUROC of 0.93. The authors concluded that automated analysis of stress echocardiograms was possible using AI and provision of automated classifications to clinicians when reading stress echocardiograms could improve accuracy, inter-reader agreement, and reader confidence. Moreover, these researchers stated that further investigations are needed to prospectively examine these tools in formal randomized trials to determine their impact on patient outcomes.
The authors stated that this study had several drawbacks. First, these investigators have not compared against another diagnostic test (i.e., SE against a 2nd diagnostic test, e.g., fractional flow reserve); thus, the model could not exclude the possibility of some degree of disease in those patients who did not undergo angiography. However, based on the follow-up data, the authors knew these patients did not have acute cardiac events; thus, they were appropriate to manage medically. This classification was consistent with recent randomized trials that focused on how imaging influenced clinical practice and outcome, including highlighting that routine referral for angiography may not be warranted for many patients for whom it would be better to manage medically after their imaging test. The next phase of this work is a randomized controlled trial (RCT), PROTEUS (PROspective randomised controlled Trial Evaluating the Use of artificial intelligence in Stress echocardiography), which will formally examine if provision of this AI-derived guidance would affect clinical outcome and resource use, such as angiography, for the patient. Second, the training sample size was relatively small, and to avoid biased over-estimations of summary performance statistics, missing and inconclusive data were handled using routine approaches. In this manner, all cases were included as far as practicable to minimize associated biases. To ensure this did not result in over-estimation of performance, the stability of the model was tested in the independent data-set. The data-set used for testing also varied in frequency of clinical characteristics from the training data-set consistent with approaches to ensure robust, generalizable independent testing data-sets.
Third, the disease classification of “severe coronary disease” was based on clinician interpretation of invasive coronary angiography (ICA), and these researchers did not have access to quantitative measures of coronary stenosis to confirm severity. Although they used an adjudication committee blinded to the SE result to confirm diagnosis, the imprecision of stenosis assessment might have reduced accuracy of training. Fourth, the model was trained to identify severe coronary disease as a “yes/no” classification. Information on angiography was available in all patients in the testing data-set who had adverse events (AEs) after SE; however, some degree of coronary disease was not excluded in those classified as “non-severe coronary disease”. In clinical application, this group would require clinical assessment to decide on the most appropriate management. In the future, it may be possible to develop and train further models to provide classification of disease severities. Fifth, these investigators did not differentiate between modes of stress, and future studies may be of value to understand whether bespoke models for each stressor could increase accuracy. Sixth, these researchers did not take account of ethnicity or race in the development of the model and further work could be considered to examine if incorporating this information into models could optimize them further.
In an editorial commentary on the afore-mentioned study by Upton et al (2022), Pellikka (2022) noted the following -- “Will AI replace echocardiographers? Not anytime soon. AI results must be interpreted in the context of other available echocardiographic and stress testing information. However, AI stands to increase the efficiency and reproducibility of echocardiography; we must strive to understand AI and be prepared to document its effectiveness. AI in stress echocardiography should not be regarded a threat but rather a remarkable opportunity to further enhance the value of an already extremely useful test”.
Akerman et al (2023) stated that detection of HFpEF entails integration of multiple imaging and clinical features which are often discordant or indeterminate. These researchers employed AI to analyze a single apical 4-chamber trans-thoracic echocardiogram (TTE) video clip to detect HFpEF. A 3-dimensional (3D) convolutional neural network (CNN) was developed and trained on apical 4-chamber video clips to classify patients with HFpEF (diagnosis of HF, EF of 50 % or greater, and echocardiographic evidence of increased filling pressure; cases) versus without HFpEF (EF of 50 % or greater, no diagnosis of HF, normal filling pressure; controls). Model outputs were classified as HFpEF, no HFpEF, or non-diagnostic (high uncertainty). Performance was evaluated in an independent multi-site data-set and compared to previously validated clinical scores. Training and validation included 2,971 cases and 3,785 controls (validation hold-out, 16.8 % patients), and showed excellent discrimination (AUROC: 0.97 [95 % CI: 0.96 to 0.97] and 0.95 [95 % CI: 0.93 to 0.96] in training and validation, respectively). In independent testing (646 cases, 638 controls), 94 (7.3 %) were non-diagnostic; sensitivity (87.8 %; 95 % CI: 84.5 % to 90.9 %) and specificity (81.9 %; 95 % CI: 78.2 % to 85.6 %) were maintained in clinically relevant subgroups, with high repeatability and reproducibility. Of 701 and 776 indeterminate outputs from the HFA-PEFF (Heart Failure Association Pre-test assessment, Echocardiography and natriuretic peptide score, Functional testing, and Final etiology) and H2FPEF (Heavy, Hypertensive, Atrial fibrillation, Pulmonary hypertension, Elder, and Filling pressure) scores, the AI HFpEF model correctly re-classified 73.5 % and 73.6 %, respectively. During follow-up (median: 2.3 [inter-quartile range (IQR): 0.5 to 5.6] years), 444 (34.6 %) patients died; mortality was higher in patients classified as HFpEF by AI (HR: 1.9 [95 % CI: 1.5 to 2.4]).
The authors presented a novel AI HFpEF model which, based on only a single routinely acquired TTE video clip, accurately detected HFpEF, provided fewer non-diagnostic outputs than current clinical scores, and identified patients with worse survival. These investigators stated that the use of this classifier in the screening for HFpEF, especially when the diagnosis is uncertain, has the potential to automate an accurate detection process for a complex clinical syndrome, resulting in more patients getting a correct and expeditious diagnosis.
The authors stated that this study had 2 main drawbacks. First, diagnostic details of each case were not adjudicated; thus, it was possible that some controls had sub-clinical disease, albeit representative of patients in major clinical trials. Nonetheless, an important progression for the current model is to increase capacity and validate detection of HFpEF earlier in the clinical pathway, especially when patients might have dyspnea on exertion, but not at rest (e.g., patients referred for diastolic stress testing, or invasive filling pressure measurements at rest and with exertion), or when limited echocardiographic imaging occurs earlier in the pathway (e.g., point-of-care [POC] ultrasound [US]). Second, complete matching for age was not possible; patients with HFpEF were older. However, survival analysis was age-adjusted and sensitivity analysis showed no meaningful change in interpretation in only age-matched patients. Further investigations are needed for re-calibration or updating of the model in other patient groups (e.g., increased filling pressure but no HF diagnosis, or indeterminate filling pressure assessment by TTE), validating its application in other echocardiography laboratories and in different demographic groups, as well as prospective evaluation of comparative effectiveness with clinical scores.
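The percentages reported for the independent test set in Akerman et al (2023) can be re-derived from the counts quoted above. The following is a minimal Python sketch for illustration only; it is not part of the study. The variable names are illustrative, and it assumes that both percentages use the full independent test set (cases plus controls) as the denominator.

```python
# Counts quoted in the summary of Akerman et al (2023) above.
cases = 646            # patients with HFpEF in independent testing
controls = 638         # controls in independent testing
non_diagnostic = 94    # model outputs classified as non-diagnostic
deaths = 444           # deaths during follow-up

# ASSUMPTION: both percentages are taken over the whole test set.
total = cases + controls


def pct(part, whole):
    """Return part/whole as a percentage rounded to 1 decimal place."""
    return round(100.0 * part / whole, 1)


non_diagnostic_pct = pct(non_diagnostic, total)
mortality_pct = pct(deaths, total)

print(total, non_diagnostic_pct, mortality_pct)
```

Under that assumption, the computed values agree with the 7.3 % non-diagnostic rate and the 34.6 % mortality stated in the text.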
Woodward et al (2023) stated that SE is one of the most commonly used diagnostic imaging tests for CAD; however, it requires clinicians to visually examine scans to identify patients who may benefit from invasive investigation and treatment. EchoGo Pro provides an automated interpretation of SE based on AI image analysis. In reader studies, use of EchoGo Pro when making clinical decisions improves diagnostic accuracy and confidence. Prospective evaluation in real world practice is now important to understand the impact of EchoGo Pro on the patient pathway and outcome. The PROTEUS Trial is a randomized, 2-armed, non-inferiority, multi-center study aiming to recruit 2,500 participants from National Health Service (NHS) hospitals in the U.K. referred to SE clinics for investigation of suspected CAD. All participants will undergo a stress echocardiogram protocol as per local hospital policy. Participants will be randomized 1:1 to a control group, representing current practice, or an intervention group, in which clinicians will receive an AI image analysis report (EchoGo Pro, Ultromics Ltd) to use during image interpretation, indicating the likelihood of severe CAD. The primary outcome will be appropriateness of clinician decision to refer for coronary angiography. Secondary outcomes will evaluate other health impacts including appropriate use of other clinical management approaches, impact on variability in decision-making, patient and clinician qualitative experience and a health economic analysis. The authors concluded that the PROTEUS Trial will be the 1st study to examine the impact of introducing an AI medical diagnostic aid into the standard care pathway of patients with suspected CAD being examined with SE.
Cassianni et al (2024) noted that HFpEF accounts for approximately 50 % of diagnoses of HF and often results in hospitalization. Clinical algorithms developed for diagnosis have been used to stratify risk for HF hospitalization or death. Deep learning (DL) has been applied to the automated interpretation of echocardiograms; however, limited information exists regarding the potential of the learning models for predicting clinical outcomes. An AI model was recently developed to identify patients with HFpEF using a single apical 4-chamber video clip from a standard TTE examination. A CNN was applied to the video clip. The model comprised a series of 3D convolutional layers designed to operate on 2D videos over 2 in-plane spatial dimensions within the image frames and across the time dimension. In a retrospective, multi-center study, these researchers examined the association between the model output and other HF biomarkers, risk for HF hospitalization and cardiac mortality, and compared its performance with 2 clinical scores: H2FPEF (heavy, hypertensive, atrial fibrillation, pulmonary hypertension, elder, and filling pressure) and HFA-PEFF (Heart Failure Association pretest assessment, echocardiography and natriuretic peptide score, functional testing, and final etiology). The model was developed to classify patients with HFpEF versus individuals without HFpEF (control subjects). Patients with HFpEF were defined according to guidelines and included a diagnosis by the treating physician within 1 year of an echocardiographic examination showing elevated LVFP. Control subjects were patients undergoing clinically indicated echocardiography who lacked these features. All patients had LVEFs of 50 % or greater. The present analysis used the 2nd version of the AI model.
In the previously described independent test population consisting of 646 patients with HFpEF and 638 control subjects, the updated model produced 95 uncertain outputs (7.4 %); in the remaining 607 patients and 582 control subjects, sensitivity was 89.8 % (95 % CI: 87.5 % to 92.5 %), specificity was 86.3 % (95 % CI: 83.6 % to 89.7 %), negative predictive value (NPV) was 89.0 % (95 % CI: 87.0 % to 91.7 %), and positive predictive value (PPV) was 87.2 % (95 % CI: 84.7 % to 89.8 %). Incident HF hospitalization was obtained from electronic health record chart review using standardized definitions, using the 1st event after the echocardiographic examination. Mortality was obtained from the National Death Index, and causes of cardiac deaths were manually reviewed. End-points were adjudicated by investigators blinded to AI analysis results. Cardiac mortality and HF hospitalization were plotted accounting for death as a competing risk. The method of Fine and Gray was employed to estimate the hazard ratios (HRs) adjusted for differences in age and sex. Among 1,284 patients followed for a median of 3.4 years (IQR, 1.7 years to 6.5 years), there were 252 HF hospitalizations and 540 deaths. Cardiac deaths (n = 135) were attributable to HF in 63 patients (47 %), to coronary artery disease in 55 (41 %), to valve disease in 5 (4 %), to arrhythmia in 5 (4 %), and to other causes in 7 (5 %). Again adjusting for age and sex, cardiac mortality was higher in patients with positive output (HR, 5.55; 95 % CI: 3.28 to 9.37; p < 0.001); patients with an uncertain output tended to have a higher mortality (HR, 2.22; 95 % CI: 0.94 to 5.24; p = 0.07). Patients with higher continuous probability outputs showed incrementally higher risk for cardiac mortality (4th quartile versus 1st quartile: HR, 11.65; 95 % CI: 4.65 to 29.20; p < 0.0001). Application of the AI model to the non-diagnostic H2FPEF outputs allowed the classification of all but 68 of the 776 patients (8.8 %).
The AI model showed a similar relationship between output and risk for HF hospitalization in patients with and those without diagnostic H2FPEF output. Findings were similar when patients were stratified according to HFA-PEFF score. HFA-PEFF score, brain natriuretic peptide (BNP), and N-terminal pro–brain natriuretic peptide (NT-proBNP) also differed according to the AI model prediction. Few patients underwent exercise testing; exercise capacity did not differ significantly. The authors concluded that positive model output was associated with higher risks for HF hospitalization and cardiac mortality, and that patients with uncertain outputs showed intermediate risks for these end-points. HF hospitalization and cardiac mortality risk were incrementally associated with higher model probability output scores, and the AI model re-classified HF hospitalization risk in non-diagnostic clinical scores, including 91 % for H2FPEF outputs and 92 % for HFA-PEFF. This was the first AI echocardiographic model to produce outputs discriminating a specific disease (HFpEF) that are incrementally associated with risk for HF hospitalization and cardiac mortality. Moreover, these researchers stated that prospective studies are needed to confirm these retrospective findings, to externally validate the AI model’s outputs in other echocardiographic laboratories, and to understand the implications for patient management. Studies using a broad representation of HFpEF phenotypes should be undertaken to understand the generalizability of this model in a naturally heterogeneous clinical syndrome.
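The diagnostic accuracy figures reported by Cassianni et al (2024) for the independent test population are related by simple arithmetic. The following minimal Python sketch, offered for illustration only, shows how predictive values follow from sensitivity, specificity, and group sizes. The 2 x 2 cell counts are not given in the text; here they are an assumption, reconstructed by rounding the reported sensitivity and specificity against the reported group sizes. The variable names are illustrative.

```python
# Figures quoted in the summary of Cassianni et al (2024) above.
n_total = 1284         # 646 patients with HFpEF + 638 control subjects
n_uncertain = 95       # uncertain model outputs
n_hfpef = 607          # patients with HFpEF and a diagnostic output
n_control = 582        # control subjects with a diagnostic output
sensitivity = 0.898
specificity = 0.863

# ASSUMPTION: cell counts are reconstructed by rounding; the actual
# confusion matrix is not stated in the text.
tp = round(n_hfpef * sensitivity)    # true positives
fn = n_hfpef - tp                    # false negatives
tn = round(n_control * specificity)  # true negatives
fp = n_control - tn                  # false positives

# Predictive values, as percentages rounded to 1 decimal place.
ppv = round(100.0 * tp / (tp + fp), 1)
npv = round(100.0 * tn / (tn + fn), 1)
uncertain_pct = round(100.0 * n_uncertain / n_total, 1)

print(tp, fn, tn, fp, ppv, npv, uncertain_pct)
```

With these reconstructed counts, the computed PPV, NPV, and uncertain-output rate agree with the reported 87.2 %, 89.0 %, and 7.4 %. Predictive values depend on the proportion of cases in the sample, so they would differ in a population with a different prevalence of HFpEF.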
Akerman et al (2025) stated that AI models to identify HFpEF based on DL of echocardiograms could aid in addressing under-recognition in clinical practice; however, they require extensive validation, especially in representative and complex clinical cohorts for which they could provide most value. In a retrospective, case-control study, these researchers enrolled patients with HFpEF (cases; n = 240), and age, sex, and year of echocardiogram matched controls (n = 256). They compared the diagnostic performance (discrimination, calibration, classification, and clinical utility) and prognostic associations (mortality and HF hospitalization) between an updated AI HFpEF model (EchoGo Heart Failure v2) and existing clinical scores (H2FPEF and HFA-PEFF). The AI HFpEF model and H2FPEF score revealed similar discrimination and calibration; however, classification was higher with AI than H2FPEF and HFA-PEFF, attributable to fewer intermediate scores, due to discordant multi-variable inputs. The continuous AI HFpEF model output added information beyond the H2FPEF, and integration with existing scores increased correct management decisions. Individuals with a diagnostic positive result from AI had a 2-fold increased risk of the composite outcome. The authors concluded that these findings suggested a possible role for a combined clinical and AI approach towards the recognition of HFpEF, ultimately with the goal of reducing uncertainty in HFpEF diagnosis and ensuring timely and appropriate treatment for this high-risk population.
The authors stated that this study had several drawbacks. First, it must be acknowledged that the clinical HFpEF syndrome has no clear and consistent definition, and a heterogeneous etiology, meaning that not all HFpEF phenotypes and definitions will be captured in the current study. While the definitions adopted in this study were consistent with recent HF guidelines, other patients who might be reasonably associated with the HFpEF syndrome might not be captured under the current definition -- namely, patients in whom current (or prior) EF values were below 50 %; LVFP confirmed via alternative methods such as exercise echocardiography or right heart catheterization; and patients who might have been hospitalized due to HF at different clinical sites, or captured at different phases of the diagnostic pathway (e.g., POCUS or advanced HF clinics). Second, categorical outputs from all 3 models incorporated a non-diagnostic, or “intermediate” classification intended to support further confirmatory testing; thus, direct comparison between models must consider whether such intermediate classifications were due to missingness or discordance, whether this would occur in clinical implementation, and the potential diagnostic information they might contribute. For instance, NT-proBNP was missing in nearly 25 % of patients, reflecting the challenges of guideline adherence in clinical practice as well as the clinical scores that employed such laboratory markers, but could be obtained as required. When intermediate classifications were considered a relevant element in the decision-making process, this study showed superiority of the AI HFpEF model. However, it must also be acknowledged that several patients were excluded due to poor image quality. While this degree of poor image quality was within published norms for echocardiography, it nevertheless exerted a greater effect on the AI HFpEF model compared to multi-parametric clinical scores.
Third, the continuous outcome probability of HFpEF from the AI HFpEF model, similar to the raw score values for the H2FPEF score, may provide additional information on risk beyond the dichotomized results. Fourth, further training and development are needed to ensure that the intended use population(s) are appropriately accounted for, and model improvements such as calibration in mid-range probabilities are clear areas where improvement would benefit clinical implementation (e.g., in uncertain populations); extensive retrospective and prospective validation are needed to ensure that such features provide the intended benefit to clinicians and patients.
Paton et al (2025) noted that HFpEF is a complex clinical syndrome in which signs and symptoms of HF occur despite a normal LVEF; TTE is the 1st-line imaging modality, but disparities in patient pathways across the U.K. can result in delayed diagnosis and treatment. These researchers developed and validated a consistent, clinically appropriate and practical approach for reporting the echocardiographic suspicion of HFpEF. Using the Delphi method, a steering group of 9 U.K. experts identified key domains for discussion and generated consensus statements relevant to the echocardiographic detection of HFpEF. Using a 4-point Likert scale, a survey including all statements was disseminated among a wider audience of healthcare professionals to determine agreement. Agreement of 75 % or higher was defined as “strong” consensus and 90 % or higher as “very strong”. A total of 34 consensus statements were generated in 7 domains: (1) challenges in the system approach to HFpEF; (2) enhancing referral for specialist review including echocardiography; (3) confidence in using a summary statement in an echo report; (4) identifying HFpEF and its underlying etiology; (5) HF awareness, training and education; (6) refining multi-disciplinary team roles in decision-making; and (7) optimizing patient experience. Statement 7 in the survey was worded to evaluate the perceived risk that a diagnosis of HFpEF may be missed if all appropriate images and measurements are not obtained during the echocardiogram, especially in the presence of co-morbidities. There is the potential for AI to aid automated echocardiographic detection of HFpEF, supporting the development of a rapid diagnostic approach for HFpEF; however, the safety and effectiveness of such methods are yet to be reported in clinical practice. A total of 135 U.K. specialists experienced in managing HF participated in the survey, including physiologists/clinical scientists (n = 43), HF specialist nurses (n = 35), cardiologists (n = 34), general practitioners (n = 12), pharmacists (n = 4) and others (n = 7). A total of 20 of 34 (59 %) statements achieved very strong agreement, 10 of 34 (29 %) achieved strong agreement and 4 of 34 (12 %) did not meet the consensus threshold. The authors concluded that diagnosis of HFpEF requires access to essential diagnostic tools. Establishing standardized pathways for specialist assessment and referral, including TTE reporting of HFpEF, may aid in eliminating diagnostic delays and geographical disparities. Further education and awareness are important for improving detection rates, prompt referral and patient experience.
Guidelines from the American Society of Echocardiography (Nagueh et al, 2025) state: "Further validation of this model and other [machine learning] models using invasive hemodynamics and clinical outcomes is needed to establish precision, reproducibility, and clinical relevance."
References
The above policy is based on the following references:
- Akerman AP, Al-Roub N, Angell-James C, et al. External validation of artificial intelligence for detection of heart failure with preserved ejection fraction. Nat Commun. 2025;16:2915.
- Akerman AP, Porumb M, Scott CG, et al. Automated echocardiographic detection of heart failure with preserved ejection fraction using artificial intelligence. JACC Adv. 2023;2(6):100452.
- Cassianni C, Huntley GD, Castrichini M, et al. Automated echocardiographic detection of heart failure with preserved ejection fraction using artificial intelligence is associated with cardiac mortality and heart failure hospitalization. J Am Soc Echocardiogr. 2024;37(9):914-916.
- CVP Diagnostics, Inc. VeriCor Monitor. Revolutionizing Heart Failure Care [website]. Boston, MA: CVP Diagnostics; 2010. Available at: http://www.cvpdiagnostics.com/. Accessed November 14, 2010.
- Hunt SA, Baker DW, Chin MH, et al. ACC/AHA guidelines for the evaluation and management of chronic heart failure in the adult: A report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Committee to Revise the 1995 Guidelines for the Evaluation and Management of Heart Failure). Bethesda, MD: American College of Cardiology (ACC); 2001.
- Nagueh SF, Sanborn DY, Oh JK, et al. Recommendations for the Evaluation of Left Ventricular Diastolic Function by Echocardiography and for Heart Failure With Preserved Ejection Fraction Diagnosis: An Update From the American Society of Echocardiography. J Am Soc Echocardiogr. 2025;38(7):537-569.
- O'Driscoll JM, Hawkes W, Beqiri A, et al. Left ventricular assessment with artificial intelligence increases the diagnostic accuracy of stress echocardiography. Eur Heart J Open. 2022;2(5):oeac059.
- Paton MF, Barton C, Baruah R, et al. Echocardiography reporting in heart failure with preserved ejection fraction: Delphi consensus study. Open Heart. 2025;12(1):e003063.
- Patterson RP, Zhang J. Impedance cardiographic measurement of the physiological response to the Valsalva manoeuvre. Med Biol Eng Comput. 2003;41(1):40-43.
- Pellikka PA. Artificially intelligent interpretation of stress echocardiography: The future is now. JACC Cardiovasc Imaging. 2022;15(5):728-730.
- Sharma GV, Woods PA, Lambrew CT, et al. Evaluation of a noninvasive system for determining left ventricular filling pressure. Arch Intern Med. 2002;162:2084-2088.
- Sharma GV, Woods PA, Lindsey N, et al. Noninvasive monitoring of left ventricular end-diastolic pressure reduces rehospitalization rates in patients hospitalized for heart failure: A randomized controlled trial. J Card Fail. 2011;17(9):718-725.
- U.S. Food and Drug Administration (FDA), Center for Devices and Radiological Health (CDRH). VeriCor. CVP Diagnostics, Inc. 510(k) No. K031327. Rockville, MD: FDA; June 7, 2004.
- Upton R, Mumith A, Beqiri A, et al. Automated echocardiographic detection of severe coronary artery disease using artificial intelligence. JACC Cardiovasc Imaging. 2022;15(5):715-727.
- Upton R, Strom JB. Guidelines for the digital era: A vision for artificial intelligence in HFpEF. J Am Soc Echocardiogr. 2025;38(7):636.
- Wang TKM, Cremer PC, Chan N, et al. Utility of an automated artificial intelligence echocardiography software in risk stratification of hospitalized COVID-19 patients. Life (Basel). 2022;12(9):1413.
- Woodward G, Bajre M, Bhattacharyya S, et al. PROTEUS Study: A prospective randomized controlled trial evaluating the use of artificial intelligence in stress echocardiography. Am Heart J. 2023;263:123-132.
