Automated Audiometry

Number: 0870


Aetna considers automated audiometry that is either self-administered or administered by non-audiologists experimental and investigational because its effectiveness has not been adequately validated to be equivalent to audiometry performed by an audiologist.


A limited number of studies have compared computer-assisted audiometry that is self-administered or administered by non-audiologists to audiometry administered by an audiologist. 

Mahomed et al (2013) conducted a meta-analysis of studies reporting within-subject comparisons of manual and automated threshold audiometry.  The authors found overall average differences between manual and automated air conduction audiometry to be comparable with test-retest differences for manual and automated audiometry.  The authors found, however, limited data on automated audiometry in children and difficult-to-test populations, automated bone conduction audiometry, and data on the performance of automated audiometry in different types and degrees of hearing loss.

The American Speech-Language-Hearing Association (2013) recommends that hearing screening be conducted under the supervision of an audiologist holding the ASHA Certificate of Clinical Competence (CCC).

In a prospective diagnostic study, Foulad et al (2013) determined the feasibility of an Apple iOS-based automated hearing testing application and compared its accuracy with conventional audiometry.  An iOS-based software application was developed to perform automated pure-tone hearing testing on the iPhone, iPod touch, and iPad.  To assess for device variations and compatibility, preliminary work was performed to compare the standardized sound output (dB) of various Apple device and headset combinations.  A total of 42 subjects underwent automated iOS-based hearing testing in a sound booth, automated iOS-based hearing testing in a quiet room, and conventional manual audiometry.  The maximum difference in sound intensity between various Apple device and headset combinations was 4 dB.  On average, 96 % (95 % confidence interval [CI]: 91 % to 100 %) of the threshold values obtained using the automated test in a sound booth were within 10 dB of the corresponding threshold values obtained using conventional audiometry.  When the automated test was performed in a quiet room, 94 % (95 % CI: 87 % to 100 %) of the threshold values were within 10 dB of the threshold values obtained using conventional audiometry.  Under standardized testing conditions, 90 % of the subjects preferred iOS-based audiometry as opposed to conventional audiometry.  The authors concluded that Apple iOS-based devices provided a platform for automated air conduction audiometry without requiring extra equipment and yielded hearing test results that approach those of conventional audiometry.  This was a feasibility study; its findings need to be validated by well-designed studies.
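The agreement measure reported above (the share of automated thresholds that fall within 10 dB of the corresponding conventional thresholds) can be illustrated with a minimal Python sketch.  The function name `fraction_within`, the choice to count a difference of exactly 10 dB as "within", and the threshold values are assumptions made for illustration; they are not taken from Foulad et al.

```python
def fraction_within(automated, manual, tolerance_db=10):
    """Fraction of paired thresholds whose absolute difference is <= tolerance_db.

    Treating a difference of exactly tolerance_db as agreement is an
    assumption of this sketch, not something stated in the study.
    """
    if len(automated) != len(manual) or not automated:
        raise ValueError("need two equal-length, non-empty threshold lists")
    hits = sum(abs(a - m) <= tolerance_db for a, m in zip(automated, manual))
    return hits / len(automated)


# Hypothetical thresholds (dB HL) for one ear at six test frequencies.
# These values are illustrative only and are not study data.
manual_db = [10, 15, 20, 25, 40, 55]
automated_db = [15, 15, 30, 20, 55, 50]

agreement = fraction_within(automated_db, manual_db)
print(f"{agreement:.1%} of thresholds within 10 dB")
```

In this made-up example, 5 of the 6 paired thresholds differ by 10 dB or less, and the remaining pair differs by 15 dB.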

Khoza-Shangase and Kassner (2013) determined the accuracy of UHear™, an audiometer application downloadable onto an iPod Touch©, when compared with conventional audiometry.  Participants were primary school students.  A total of 86 participants (172 ears) were included.  Of these 86 participants, 44 were females and 42 were males, with ages ranging from 8 to 10 years (mean age of 9.0 years).  Each participant underwent 2 audiological screening evaluations: one by means of conventional audiometry and the other by means of UHear™.  Otoscopy and tympanometry were performed on each participant to determine the status of the outer and middle ear before pure tone air conduction screening by means of a conventional audiometer and UHear™.  The lowest audible hearing thresholds from each participant were obtained at conventional frequencies.  Using the paired t-test, it was determined that there was a statistically significant difference between hearing screening thresholds obtained from conventional audiometry and UHear™.  The screening thresholds obtained from UHear™ were significantly elevated (worse) in comparison to conventional audiometry.  The difference in thresholds may be attributed to differences in transducers used, ambient noise levels and lack of calibration of UHear™.  The authors concluded that the UHear™ is not as accurate as conventional audiometry in determining hearing thresholds during screening of school-aged children.  Moreover, they stated that caution needs to be exercised when using such measures and research evidence needs to be established before they can be endorsed and used with the general public.

In a Cochrane review, Barker et al (2014) stated that acquired adult-onset hearing loss is a common long-term condition for which the most common intervention is hearing aid fitting. However, up to 40 % of people fitted with a hearing aid either fail to use it or may not gain optimal benefit from it. These investigators evaluated the long-term effectiveness of interventions to promote the use of hearing aids in adults with acquired hearing loss fitted with at least 1 hearing aid. The authors concluded that there is some low to very low quality evidence to support the use of self-management support and complex interventions combining self-management support and delivery system design in adult auditory rehabilitation. However, effect sizes were small and the range of interventions that had been tested was relatively limited.

In a 2-phase correlational study, Convery et al (2015) evaluated the reliability and validity of an automatic audiometry algorithm that is fully implemented in a wearable hearing aid, to determine to what extent reliability and validity are affected when the procedure is self-directed by the user, and to investigate contributors to a successful outcome. A total of 60 adults with mild-to-moderately severe hearing loss participated in both studies: 20 in Study 1 and 40 in Study 2; 27 participants in Study 2 attended with a partner. Participants in both phases were selected for inclusion if their thresholds were within the output limitations of the test device. In both phases, participants performed automatic audiometry through a receiver-in-canal, behind-the-ear hearing aid coupled to an open dome. In Study 1, the experimenter directed the task. In Study 2, participants followed a set of written, illustrated instructions to perform automatic audiometry independently of the experimenter, with optional assistance from a lay partner. Standardized measures of hearing aid self-efficacy, locus of control, cognitive function, health literacy, and manual dexterity were administered. Statistical analysis examined the repeatability of automatic audiometry; the match between automatically and manually measured thresholds; and contributors to successful, independent completion of the automatic audiometry procedure. When the procedure was directed by an audiologist, automatic audiometry yielded reliable and valid thresholds. Reliability and validity were negatively affected when the procedure was self-directed by the user, but the results were still clinically acceptable: test-retest correspondence was 10 dB or lower in 97 % of cases, and 91 % of automatic thresholds were within 10 dB of their manual counterparts. However, only 58 % of participants were able to achieve a complete audiogram in both ears. 
Cognitive function significantly influenced accurate and independent performance of the automatic audiometry procedure; accuracy was further affected by locus of control and level of education. Several characteristics of the automatic audiometry algorithm played an additional role in the outcome. The authors concluded that average transducer- and coupling-specific correction factors are sufficient for a self-directed in-situ audiometry procedure to yield clinically reliable and valid hearing thresholds. Before implementation in a self-fitting hearing aid, however, the algorithm and test instructions should be refined in an effort to increase the proportion of users who are able to achieve complete audiometric results. They stated that further evaluation of the procedure, particularly among populations likely to form the primary audience of a self-fitting hearing aid, should be undertaken.

Levit and colleagues (2015) estimated the rate of hearing loss detected by first-stage oto-acoustic emissions test but missed by second-stage automated auditory brainstem response (ABR) testing.  The data of 17,078 infants who were born at Lis Maternity Hospital between January 2013 and June 2014 were reviewed.  Infants who failed screening with a transient evoked oto-acoustic emissions (TEOAE) test and infants admitted to the NICU for more than 5 days underwent screening with an automated ABR test at 45 decibel hearing level (dB HL).  All infants who failed screening with TEOAE were referred to a follow-up evaluation at the hearing clinic.  A total of 24 % of the infants who failed the TEOAE and passed the automated ABR hearing screening tests were eventually diagnosed with hearing loss by diagnostic ABR testing (22/90).  They comprised 52 % of all of the infants in the birth cohort who were diagnosed with permanent or persistent hearing loss greater than 25 dB HL in 1 or both ears (22/42).  Hearing loss greater than 45 dB HL, which is considered to be in the range of moderate-to-profound severity, was diagnosed in 36 % of the infants in this group (8/22), comprising 42 % of the infants with hearing loss of this degree (8/19).  The authors concluded that the sensitivity of the diverse response detection methods of automated ABR devices needs to be further empirically evaluated.

Brennan-Jones and associates (2016) examined the accuracy of automated audiometry in a clinically heterogeneous population of adults using the KUDUwave automated audiometer.  Manual audiometry was performed in a sound-treated room, whereas automated audiometry was not conducted in a sound-treated environment.  A total of 42 participants were consecutively recruited from a tertiary otolaryngology department in Western Australia.  Absolute mean differences ranged from 5.12 to 9.68 dB (air-conduction) and from 8.26 to 15 dB (bone-conduction).  A total of 86.5 % of manual and automated 4-frequency averages (4FAs) were within 10 dB (i.e., ±5 dB); 94.8 % were within 15 dB.  However, there were significant (p < 0.05) differences between automated and manual audiometry at 250, 500, 1,000, and 2,000 Hz (air-conduction) and 500 and 1,000 Hz (bone-conduction).  The effect of age (greater than or equal to 55 years) on accuracy (p = 0.014) was not significant on linear regression (p > 0.05; R^2 = 0.11).  The presence of a hearing loss (better ear greater than or equal to 26 dB) did not significantly affect accuracy (p = 0.604 for air-conduction; p = 0.218 for bone-conduction).  The authors concluded that the findings of this study provided clinical validation of automated audiometry using the KUDUwave in a clinically heterogeneous population, without the use of a sound-treated environment.  They stated that while threshold variations were statistically significant, future research is needed to ascertain the clinical significance of such variation.

In a pilot study, Brennan-Jones and colleagues (2017) examined the diagnostic accuracy of automated audiometry in adults with hearing loss in an asynchronous tele-health model using pre-defined diagnostic protocols. These researchers recruited 42 study participants from a public audiology and otolaryngology clinic in Perth, Western Australia.  Manual audiometry was performed by an audiologist either before or after automated audiometry.  Diagnostic protocols were applied asynchronously for normal hearing, disabling hearing loss, conductive hearing loss and unilateral hearing loss.  Sensitivity and specificity analyses were conducted using a 2-by-2 matrix and Cohen's kappa was used to measure agreement.  The overall sensitivity for the diagnostic criteria was 0.88 (range of 0.86 to 1) and overall specificity was 0.93 (range of 0.86 to 0.97).  Overall kappa (k) agreement was "substantial" k = 0.80 (95 % CI: 0.70 to 0.89) and significant at p < 0.001.  The authors concluded that pre-defined diagnostic protocols applied asynchronously to automated audiometry provide accurate identification of disabling, conductive and unilateral hearing loss.  They stated that this method has the potential to improve synchronous and asynchronous tele-audiology service delivery.
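As an illustration of how Cohen's kappa is derived from a 2-by-2 agreement matrix of the kind described above, the following Python sketch may be helpful.  The function name `cohens_kappa` and the cell counts are hypothetical and were chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def cohens_kappa(both_pos, auto_only, manual_only, both_neg):
    """Cohen's kappa for two binary classifications of the same subjects.

    both_pos    : positive on automated and manual audiometry
    auto_only   : positive on automated audiometry only
    manual_only : positive on manual audiometry only
    both_neg    : negative on both
    """
    n = both_pos + auto_only + manual_only + both_neg
    if n == 0:
        raise ValueError("matrix is empty")
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from the marginal totals of each method.
    auto_pos = both_pos + auto_only
    manual_pos = both_pos + manual_only
    expected = (auto_pos * manual_pos + (n - auto_pos) * (n - manual_pos)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical 2-by-2 counts (illustrative only, not taken from the study).
kappa = cohens_kappa(both_pos=20, auto_only=5, manual_only=5, both_neg=70)
print(f"kappa = {kappa:.2f}")
```

With these made-up counts, observed agreement is 0.90 and chance agreement is 0.625, so kappa is (0.90 - 0.625) / (1 - 0.625), or approximately 0.73.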

In a prospective, cross-over, equivalence study, Whitton and associates (2016) compared hearing measurements made at home using self-administered audiometric software against audiological tests performed on the same subjects in a clinical setting.  In experiment 1, adults with varying degrees of hearing loss (n = 19) performed air-conduction audiometry, frequency discrimination, and speech recognition in noise testing twice at home with an automated tablet application and twice in sound-treated clinical booths with an audiologist.  The accuracy and reliability of computer-guided home hearing tests were compared to audiologist administered tests.  In experiment 2, the reliability and accuracy of pure-tone audiometric results were examined in a separate cohort across a variety of clinical settings (n = 21).  Remote, automated audiograms were statistically equivalent to manual, clinic-based testing from 500 to 8,000 Hz (p ≤ 0.02); however, 250 Hz thresholds were elevated when collected at home.  Remote and sound-treated booth testing of frequency discrimination and speech recognition thresholds were equivalent (p ≤ 5 × 10^-5).  In the second experiment, remote testing was equivalent to manual sound-booth testing from 500 to 8,000 Hz (p ≤ 0.02) for a different cohort who received clinic-based testing in a variety of settings.  The authors concluded that these data provided a proof of concept that several self-administered, automated hearing measurements are statistically equivalent to manual measurements made by an audiologist in the clinic.  The demonstration of statistical equivalency for these basic behavioral hearing tests points toward the eventual feasibility of monitoring progressive or fluctuant hearing disorders outside of the clinic to increase the efficiency of clinical information collection.

Masalski and colleagues (2016) noted that hearing tests performed in the home setting by means of mobile devices require previous calibration of the reference sound level.  Mobile devices with bundled headphones create a possibility of applying the pre-defined level for a particular model as an alternative to calibrating each device separately.  These investigators determined the reference sound level for sets composed of a mobile device and bundled headphones.  Reference sound levels for Android-based mobile devices were determined using an open access mobile phone application by means of biological calibration, i.e., in relation to the normal-hearing threshold.  The examinations were conducted in 2 groups:
  1. an uncontrolled, and
  2. a controlled one.

In the uncontrolled group, the fully automated self-measurements were performed in home conditions by 18- to 35-year old subjects, without prior hearing problems, recruited online.  Calibration was conducted as a preliminary step in preparation for further examination.  In the controlled group, audiologist-assisted examinations were performed in a sound booth, on normal-hearing subjects verified through pure-tone audiometry, recruited offline from among the workers and patients of the clinic.  In both groups, the reference sound levels were determined on a subject's mobile device using Bekesy audiometry.  The reference sound levels were compared between the groups.  Intra-model and inter-model analyses were performed as well.  In the uncontrolled group, 8,988 calibrations were conducted on 8,620 different devices representing 2,040 models.  In the controlled group, 158 calibrations (test and re-test) were conducted on 79 devices representing 50 models.  Result analysis was performed for the 10 most frequently used models in both groups.  The difference in reference sound levels between uncontrolled and controlled groups was 1.50 dB (SD 4.42).  The mean SD of the reference sound level determined for devices within the same model was 4.03 dB (95 % CI: 3.93 to 4.11).  Statistically significant differences were found across models.  The authors concluded that reference sound levels determined in the uncontrolled group were comparable to the values obtained in the controlled group.  This validated the use of biological calibration in the uncontrolled group for determining the pre-defined reference sound level for new devices.  Moreover, due to a relatively small deviation of the reference sound level for devices of the same model, it was feasible to conduct hearing screening on devices calibrated with the pre-defined reference sound level.
These researchers also stated that the method presented in this study could be applied in screening hearing examinations on a large scale with the use of popular mobile devices sold with bundled headphones.  Given the rapidly growing market of mobile devices, the main advantage of the method is the semi-automated calibration of new models.  The pre-defined reference sound level for a new model may be determined on the basis of a biological calibration conducted by the first users of devices.  They stated that to confirm the estimated accuracy of the method, it is advisable to conduct a direct comparison of pure-tone audiometry and a hearing test on mobile devices calibrated biologically by means of the pre-defined reference sound level.

In a prospective study, Saliba and colleagues (2017)
  1. compared the accuracy of 2 previously validated mobile-based hearing tests in determining pure tone thresholds and screening for hearing loss, and
  2. determined the accuracy of mobile audiometry in noisy environments through noise reduction strategies.

A total of 33 adults with or without hearing loss were tested (mean age of 49.7 years; women, 42.4 %).  Air conduction thresholds measured as pure tone average and at individual frequencies were assessed by conventional audiogram and by 2 audiometric applications (consumer and professional) on a tablet device.  Mobile audiometry was performed in a quiet sound booth and in a noisy sound booth (50 dB of background noise) through active and passive noise reduction strategies.  On average, 91.1 % (95 % CI: 89.1 % to 93.2 %) and 95.8 % (95 % CI: 93.5 % to 97.1 %) of the threshold values obtained in a quiet sound booth with the consumer and professional applications, respectively, were within 10 dB of the corresponding audiogram thresholds, as compared with 86.5 % (95 % CI: 82.6 % to 88.5 %) and 91.3 % (95 % CI: 88.5 % to 92.8 %) in a noisy sound booth through noise cancellation.  When screening for at least moderate hearing loss (pure tone average greater than 40 dB HL), the consumer application showed a sensitivity and specificity of 87.5 % and 95.9 %, respectively, and the professional application, 100 % and 95.9 %.  Overall, patients preferred mobile audiometry over conventional audiograms.  The authors concluded that mobile audiometry could correctly estimate pure tone thresholds and screen for moderate hearing loss; noise reduction strategies in mobile audiometry provided a portable effective solution for hearing assessments outside clinical settings.  This was a small (n = 33) study; its findings need to be validated by well-designed studies.

Furthermore, UpToDate reviews on "Evaluation of hearing loss in adults" (Weber, 2017) and "Hearing impairment in children: Evaluation" (Smith and Gooi, 2017) do not mention automated audiometry as a management tool.

Brennan-Jones and co-workers (2018) stated that remote interpretation of automated audiometry offers the potential to enable asynchronous tele-audiology assessment and diagnosis in areas where synchronous tele-audiometry may not be possible or practical.  These researchers compared remote interpretation of manual and automated audiometry.  A total of 5 audiologists each interpreted manual and automated audiograms obtained from 42 patients.  The main outcome variable was the audiologist's recommendation for patient management (which included treatment recommendations, referral or discharge) between the manual and automated audiometry test.  Cohen's Kappa and Krippendorff's Alpha were used to calculate and quantify the intra- and inter-observer agreement, respectively, and McNemar's test was used to assess the audiologist-rated accuracy of audiograms.  Audiograms were randomized and audiologists were blinded as to whether they were interpreting a manual or automated audiogram.  Intra-observer agreement was substantial for management outcomes when comparing interpretations for manual and automated audiograms.  Inter-observer agreement was moderate between clinicians for determining management decisions when interpreting both manual and automated audiograms.  Audiologists were 2.8 times more likely to question the accuracy of an automated audiogram compared to a manual audiogram.  The authors concluded that there is a lack of agreement between audiologists when interpreting audiograms, whether recorded with automated or manual audiometry.  The main variability in remote audiogram interpretation was likely to be individual clinician variation, rather than automation.

Govender and colleagues (2018) noted that asynchronous automated telehealth-based hearing screening and diagnostic testing can be used within the rural school context to identify and confirm hearing loss.  These investigators evaluated the efficacy of an asynchronous telehealth-based service delivery model using automated technology for screening and diagnostic testing as well as to describe the prevalence, type and degree of hearing loss.  A comparative within-subject design was used.  Frequency distributions, sensitivity, specificity scores as well as the positive and negative predictive values (PPV and NPV) were calculated.  Testing was conducted in a non-sound-treated classroom within a school environment on 73 participants (146 ears).  The sensitivity and specificity rates were 65.2 % and 100 %, respectively.  Diagnostic accuracy was 91.7 % and the NPV and PPV were 93.8 % and 100 %, respectively.  Results revealed that 23 ears of 20 participants (16 %) presented with hearing loss; 12 % of ears presented with unilateral hearing impairment and 4 % with bilateral hearing loss.  Mild hearing loss was identified as most prevalent (8 % of ears); 8 ears obtained false-negative results and presented with mild low- to mid-frequency hearing loss.  The sensitivity rate for the study was low and was attributed to plausible reasons relating to test accuracy, child-related variables and mild low-frequency sensory-neural hearing loss.  The authors concluded that the findings of this study demonstrated that asynchronous telehealth-based automated hearing testing within the school context could be used to facilitate early identification of hearing loss; however, further research and development into protocol formulation, ongoing device monitoring and facilitator training is needed to improve test sensitivity and ensure accuracy of results.
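The screening statistics named above (sensitivity, specificity, PPV, NPV and accuracy) all follow from the 4 cells of a 2-by-2 table.  The Python sketch below shows the calculation; the function name `screening_metrics` and the counts are illustrative assumptions.  The split of 15 true-positives and 8 false-negatives is consistent with the 23 affected ears and 8 false-negative ears reported above (15/23 = 65.2 %), but the true-negative count is invented for the example and the other outputs should not be read as reproducing the study's figures.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def screening_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from 2-by-2 table counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # affected ears correctly referred
        "specificity": _ratio(tn, tn + fp),  # unaffected ears correctly passed
        "ppv": _ratio(tp, tp + fp),          # referred ears that are affected
        "npv": _ratio(tn, tn + fn),          # passed ears that are unaffected
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
    }


# Illustrative counts per ear; tn=100 is a made-up value, not study data.
metrics = screening_metrics(tp=15, fp=0, fn=8, tn=100)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The example also shows why a specificity and PPV of 100 % can coexist with a low sensitivity: with no false-positives every referral is correct, yet roughly 1 in 3 affected ears is still missed.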

Shojaeemend and Ayatollahi (2018) reviewed studies related to automated audiometry by focusing on the implementation of an audiometer, the use of transducers and evaluation methods.  This review study was carried out in 2017.  The papers related to the design and implementation of automated audiometry were searched in the following databases: Science Direct, Web of Science, PubMed, and Scopus.  The time frame for the papers was between January 1, 2010 and August 31, 2017.  A total of 143 papers were found, and after screening, the number of papers was reduced to 16.  The findings showed that the implementation methods were categorized into the use of software (7 papers), hardware (3 papers) and smartphones/tablets (6 papers).  The used transducers were a variety of earphones and bone vibrators.  Different evaluation methods were used to evaluate the accuracy and the reliability of the diagnoses.  However, in most studies, no significant difference was found between automated and traditional audiometry.  The authors concluded that automated audiometry produced clinically acceptable results compared with traditional audiometry.  The 2 main advantages of automated audiometry are saving costs and improving accessibility to hearing care, which can lead to a cost-effective and rapid diagnosis of hearing impairment, especially in poor areas.  The use of automated audiometry may have some challenges, such as measuring the impact of environmental noise on the test results, recording bone-conduction hearing thresholds with the possibility of generating occlusion effects by the earphones, and ensuring the quality of the automated audiometry test results.  These researchers stated that further studies need to be conducted to compare the characteristics of different computerized solutions and related challenges for automated audiometry.  
Because the performance of transducers differs, evaluation studies are needed to compare their performance to be able to choose the best one for automated audiometry.

The authors stated that this study had several drawbacks.  Due to the limitation of smartphones in generating different audio frequencies and intensities, these applications could only be used for general screening programs when traditional audiometry tests are not available.  Another limitation was about sound calibration.  Unlike an audiometer, the output sound of smartphones is not calibrated, and it may not meet the requirements of audiometry.  Moreover, the hardware of smartphones and audiometers is different, and the accuracy of the results should be examined.  These researchers stated that more studies are needed to identify the strengths and limitations of computerized solutions for automated audiometry to be able to design more effective solutions in the future.

Pereira and associates (2018) noted that very few studies have examined if tablet-based automated audiometry could offer a valid alternative to traditional manual audiometry for estimation of hearing thresholds in children.  This study examined the validity and efficiency of automated audiometry in school-aged children.  Hearing thresholds for 0.5, 1, 2, 4, 6, and 8 kHz were collected in 32 children aged 6 to 12 years using standard audiometry and tablet-based automated audiometry in a sound-proof booth.  Test administration time, test preference, and medical history were also collected.  Results showed that the majority (67 %) of threshold differences between automated and standard audiometry were within the clinically acceptable range (10 dB).  The threshold difference between the 2 tests showed that automated audiometry thresholds were higher by 12 dB in 6-year olds, 7 dB in 7- to 9-year olds, and 3 dB in 10- to 12-year olds.  In addition, test administration times were similar, such that standard audiometry took an average of 12.3 mins and automated audiometry took 11.9 mins.  The authors concluded that these results supported the use of tablet-based automated audiometry in children from ages 7 to 12 years.  However, the results suggested that the clinical use of at least some types of tablet-based automated audiometry may not be feasible in children 6 years of age.

Samelli and colleagues (2020) examined the performance of a tablet-based tele-audiometry method for automated hearing screening of schoolchildren through a comparison of the results of various hearing screening approaches.  A total of 244 children were evaluated; tablet-based screening results were compared with gold-standard pure-tone audiometry.  Acoustic immittance measurements were also conducted.  To pass the tablet-based screening, the children were required to respond to at least 2 out of 3 sounds for all the frequencies in each ear.  Several hearing screening methods were analyzed: exclusively tablet-based (with and without 500-Hz checked) and combined tests (series and parallel).  The sensitivity, specificity, PPV, NPV and accuracy were calculated.  A total of 9.43 % of children presented with mild-to-moderate conductive hearing loss (unilateral or bilateral).  Diagnostic values varied among the different hearing screening approaches that were evaluated: sensitivities ranged from 60 to 95 %, specificities ranged from 44 to 91 %, PPVs ranged from 15 to 44 %, NPVs ranged from 95 to 99 %, accuracy values ranged from 49 to 88 %, and area under curve (AUC) values ranged from 0.690 to 0.883.  Regarding diagnostic values, the highest results were found for the tablet-based screening method and for the series approach.  Compared with the results obtained by conventional audiometry and considering the diagnostic values of the different hearing screening approaches, the highest diagnostic values were generally obtained using the automated hearing screening method (including 500-Hz).  The authors concluded that this application, which was developed for the tablet computer, was shown to be a valuable hearing screening tool for use with schoolchildren.  These researchers suggested that this hearing screening protocol has the potential to improve asynchronous tele-audiology service delivery.

Colsman and colleagues (2020) noted that quantifying hearing thresholds via mobile self-assessment audiometric applications has been demonstrated repeatedly with heterogeneous results regarding accuracy.  One important limitation of several of these applications has been the lack of appropriate calibration of their core technical components (sound generator and headphones).  These researchers examined the accuracy and reliability of a calibrated application (app) for pure-tone screening audiometry by self-assessment on a tablet computer: Audimatch app installed on Apple iPad 4 in combination with Sennheiser HDA-280 headphones.  In a repeated-measures design, audiometric thresholds collected by the app were compared to those obtained by standardized automated audiometry, and test-retest reliability was also evaluated.  A total of 68 subjects aged 19 to 65 years with normal hearing were tested in a sound-attenuating booth.  An equivalence test revealed highly similar hearing thresholds for the app compared with standardized automated audiometry.  A test-retest reliability analysis within each method showed a high correlation coefficient for the app (Spearman rank correlation: rho = 0.829) and for the automated audiometer (rho = 0.792).  The results implied that the self-assessment of audiometric thresholds via a calibrated mobile device represented a valid and reliable alternative for stationary assessment of hearing loss thresholds, supporting the potential usability within the area of occupational health care.
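For readers unfamiliar with the test-retest statistic cited above, the following Python sketch shows one common way a Spearman rank correlation can be computed: thresholds are converted to ranks (tied values share the mean of their ranks) and a Pearson correlation is taken on the ranks.  The helper names and the threshold values are hypothetical, and this is not a description of the analysis software used by the study authors.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of 2 or more values")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical test and retest thresholds (dB HL); illustrative only.
test_db = [5, 10, 15, 20, 25]
retest_db = [10, 5, 15, 25, 20]
rho = spearman_rho(test_db, retest_db)
print(f"rho = {rho:.3f}")
```

Because the statistic depends only on rank order, a high rho indicates that subjects keep their relative ordering between sessions; it does not by itself show that the thresholds agree in absolute dB.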

The authors stated that this study had several drawbacks.  Test sessions were conducted in a sound-insulated booth; thus, it was not evident whether results could be compared to the measurement of hearing thresholds in noisy surroundings, such as a standard office.  Therefore, field studies with environmental noise could provide more insight on the accuracy and validity of the audiometric thresholds gathered by the app (e.g., in a waiting room of an otolaryngologist).  More importantly, a validation with audiologically impaired patients would be necessary for the estimation of sensitivity and specificity of the app for clinically relevant hearing loss.  Furthermore, the audiometric application was designed for self-assessment.  However, even though the whole audiometric procedure can be performed by the subject, the system is not intended for use in private homes, outside the range of a trained person, as special headphones are needed and regular calibration of the iPad/headphone combination is a prerequisite.  Apart from the calibration of the system, which has to be performed by a specialized company, this audiometric screening test could be operated by the user.  Some supervision by trained personnel is helpful when starting the app, but it did not require the guidance of a health care professional.  This also stood in contrast to the operation of the automated audiometer (which was used for comparison) for which the placement of the headphones and the instruction of the subjects had to be carried out by a trained person.  A further drawback of the study concerned the sampling method.  Gender and age range are well-known factors that influence hearing thresholds.  To avoid biases due to an over-proportionate representation of these specific attributes, these investigators used a sampling method that allowed them to collect data from a more representative sample than simple random sampling.
This method accepted the consequence that the recruiting was not completely random.  The proportion of men and women in the sample was balanced, and age was uniformly sampled across the age range of the study.  In addition, 2 authors of the current article were involved in the development of the app-based mobile hearing test that was evaluated in this study.  This fact was disclosed before the start of the study so that a potential influence on the design of the study, data collection or the rationale of data analysis could be contained beforehand.

Charih and associates (2020) stated that recent mobile and automated audiometry technologies have allowed for the democratization of hearing healthcare and enabled non-experts to deliver hearing tests.  The problem remains that a large number of such users are not trained to interpret audiograms.  These investigators outlined the development of a data-driven audiogram classification system designed specifically for the purpose of concisely describing audiograms.  More specifically, they described how a training data-set was assembled and how the classification system was developed using supervised learning techniques.  These researchers showed that 3 practicing audiologists had high intra- and inter-rater agreement over audiogram classification tasks pertaining to audiogram configuration, symmetry and severity.  The proposed system achieved a performance comparable to the state of the art, but was significantly more flexible.  The authors concluded that this work laid a solid foundation for future work aiming to apply machine learning techniques to audiology for audiogram interpretation.

The authors stated that this study had several drawbacks.  Due to the logistical complexity and cost of acquiring audiogram annotations, these investigators were only able to assemble a data-set of 270 distinct audiograms annotated by 3 separate audiologists.  While these researchers did ensure that their audiologists were trained in different schools of audiology and practiced audiology with different subpopulations, it was likely that their estimate of inter-rater reliability could be made more accurate by adding additional raters.  In fact, hiring more audiologists and collecting more audiograms would likely further increase confidence that these findings could be generalized.  Specifically, adding more raters is likely to increase inter-rater reliability (but not intra-rater reliability, which is a reflection of the inherent difficulty of the task).  Unfortunately, augmenting their data-set is extremely costly, as the professional services of multiple audiologists are needed.  If large public data-sets, such as the NHANES, were to include diagnostic outcomes, then this would enable larger-scale studies in the future.  A second main drawback was that the classification system presented here could not classify audiograms by site of lesion, while AMCLASS can.  Obtaining labels for this descriptor of hearing loss was impossible because the unlabeled NHANES data used in this study did not contain masked or unmasked bone conduction thresholds.  Finally, although this work was a step in the right direction, the NHANES data-set used in this study did not comprise the data needed to extend the algorithm such that it could identify a potential diagnosis or the appropriate professional to whom the patient should be referred.

These researchers stated that future work will aim to collect more data and to examine the integration of additional sources of data such as medical history, patient age, bone conduction thresholds, questionnaire data, otoscopic images, and tympanogram data.  The ultimate objective is to extend the scope of this system, such that it not only describes the audiogram, but also provides a proposed differential diagnosis.  Additionally, the system could eventually provide recommendations with respect to referral and therapeutic options.  Another avenue involves assessing the generalizability of this system, although this will entail labeling additional audiograms against which to validate the Data-Driven Annotation Engine (DDAE).  Finally, when undertaking this project, these investigators sought to examine whether machine learning could accomplish the same audiogram classification tasks normally completed by a professional audiologist.  Future studies should examine additional novel applications of machine learning in the field of audiology, beyond automating the state of the art.  However, adoption of such innovations may require a change in the practice of audiology itself and is beyond the scope of the present study.

Table: CPT Codes / HCPCS Codes / ICD-10 Codes
Code Code Description

Information in the [brackets] below has been added for clarification purposes.   Codes requiring a 7th character are represented by "+":

CPT codes not covered for indications listed in the CPB:

0208T Pure tone audiometry (threshold), automated; air only [without an audiologist]
0209T Pure tone audiometry (threshold), automated; air and bone [without an audiologist]

The above policy is based on the following references:

  1. American Speech-Language-Hearing Association (ASHA). Hearing screening and testing. Information for the Public. Rockville, MD: ASHA; 2013.
  2. Barker F, Mackenzie E, Elliott L, et al. Interventions to improve hearing aid use in adult auditory rehabilitation. Cochrane Database Syst Rev. 2014;7:CD010342.
  3. Brennan-Jones CG, Eikelboom RH, Bennett RJ, et al. Asynchronous interpretation of manual and automated audiometry: Agreement and reliability. J Telemed Telecare. 2018;24(1):37-43.
  4. Brennan-Jones CG, Eikelboom RH, Swanepoel de W, et al. Clinical validation of automated audiometry with continuous noise-monitoring in a clinically heterogeneous population outside a sound-treated environment. Int J Audiol. 2016;55(9):507-513.
  5. Brennan-Jones CG, Eikelboom RH, Swanepoel W. Diagnosis of hearing loss using automated audiometry in an asynchronous telehealth model: A pilot accuracy study. J Telemed Telecare. 2017;23(2):256-262.
  6. Charih F, Bromwich M, Mark AE, et al. Data-driven audiogram classification for mobile audiometry. Sci Rep. 2020;10(1):3962.
  7. Colsman A, Supp GG, Neumann J, Schneider TR. Evaluation of accuracy and reliability of a mobile screening audiometer in normal hearing adults. Front Psychol. 2020;11:744.
  8. Convery E, Keidser G, Seeto M, et al. Factors affecting reliability and validity of self-directed automatic in situ audiometry: Implications for self-fitting hearing aids. J Am Acad Audiol. 2015;26(1):5-18.
  9. Foulad A, Bui P, Djalilian H. Automated audiometry using Apple iOS-based application technology. Otolaryngol Head Neck Surg. 2013;149(5):700-706.
  10. Govender SM, Mars M. Assessing the efficacy of asynchronous telehealth-based hearing screening and diagnostic services using automated audiometry in a rural South African school. S Afr J Commun Disord. 2018;65(1):e1-e9.
  11. Ho AT, Hildreth AJ, Lindsey L. Computer-assisted audiometry versus manual audiometry. Otol Neurotol. 2009;30(7):876-883.
  12. Khoza-Shangase K, Kassner L. Automated screening audiometry in the digital age: Exploring Uhear and its use in a resource-stricken developing country. Int J Technol Assess Health Care. 2013;29(1):42-47.
  13. Levit Y, Himmelfarb M, Dollberg S. Sensitivity of the automated auditory brainstem response in neonatal hearing screening. Pediatrics. 2015;136(3):e641-e647.
  14. Mahomed F, Swanepoel DW, Eikelboom RH, Soer M. Validity of automated threshold audiometry: A systematic review and meta-analysis. Ear Hear. 2013;34(6):745-752.
  15. Margolis RH, Glasberg BR, Creeke S, Moore BC. AMTAS: Automated method for testing auditory sensitivity: Validation studies. Int J Audiol. 2010;49(3):185-194.
  16. Masalski M, Kipinski L, Grysinski T, Krecicki T. Hearing tests on mobile devices: Evaluation of the reference sound level by means of biological calibration. J Med Internet Res. 2016;18(5):e130.
  17. Pereira O, Pasko LE, Supinski J, et al. Is there a clinical application for tablet-based automated audiometry in children? Int J Pediatr Otorhinolaryngol. 2018;110:87-92.
  18. Saliba J, Al-Reefi M, Carriere JS, et al. Accuracy of mobile-based audiometry in the evaluation of hearing loss in quiet and noisy environments. Otolaryngol Head Neck Surg. 2017;156(4):706-711.
  19. Samelli AG, Rabelo CM, Sanches SGG, et al. Tablet-based tele-audiometry: Automated hearing screening for schoolchildren. J Telemed Telecare. 2020;26(3):140-149.
  20. Shojaeemend H, Ayatollahi H. Automated audiometry: A review of the implementation and evaluation methods. Healthc Inform Res. 2018;24(4):263-275.
  21. Smith RJH, Gooi A. Hearing impairment in children: Evaluation. UpToDate [online serial]. Waltham, MA: UpToDate; reviewed July 2017.
  22. Swanepoel de W, Mngemane S, Molemong S, et al. Hearing assessment-reliability, accuracy, and efficiency of automated audiometry. Telemed J E Health. 2010;16(5):557-563.
  23. Weber PC. Evaluation of hearing loss in adults. UpToDate [online serial]. Waltham, MA: UpToDate; reviewed July 2017.
  24. Whitton JP, Hancock KE, Shannon JM, Polley DB. Validation of a self-administered audiometry application: An equivalence study. Laryngoscope. 2016;126(10):2382-2388.
  25. Yu J, Ostevik A, Hodgetts B, Ho A. Automated hearing tests: Applying the otogram to patients who are difficult to test. J Otolaryngol Head Neck Surg. 2011;40(5):376-383.