Table 3 Description of the top 12 vocal features

From: Differentiation between depression and bipolar disorder in child and adolescents by voice features

 

| Rank | Name of feature | Meaning of feature | −log(P) |
| --- | --- | --- | --- |
| 1 | pcm_RMSenergy_sma_rqmean | Represents the root quadratic mean of the root-mean-square (RMS) energy of a speech file | 314.97 |
| 2 | pcm_fftMag_spectralSlope_sma_amean | Represents the mean value of the slope of the power spectrum of the speech signal | 303.08 |
| 3 | audSpec_Rfilt_sma[1]_percentile1.0 | Represents the 1.0 percentile of the energy magnitude of the signal in the 301–600 Hz frequency band | 303.01 |
| 4 | pcm_fftMag_spectralHarmonicity_sma_amean | Represents the mean value of the spectral harmonicity of the audio signal | 297.07 |
| 5 | pcm_fftMag_spectralFlux_sma_percentile1.0 | Represents the 1.0 percentile of the spectral flux of the audio signal | 294.77 |
| 6 | pcm_fftMag_fband250-650_sma_percentile1.0 | Represents the 1.0 percentile of the spectral amplitude of the audio signal in the frequency band 250 Hz to 650 Hz | 294.17 |
| 7 | pcm_fftMag_fband1000-4000_sma_percentile1.0 | Represents the 1.0 percentile of the spectral amplitude of the audio signal in the frequency band 1000 Hz to 4000 Hz | 289.21 |
| 8 | audspec_lengthL1norm_sma_amean | Represents the mean value of the L1 norm (length) of the auditory spectrum of the audio signal | 288.56 |
| 9 | pcm_RMSenergy_sma_percentile1.0 | Represents the 1.0 percentile of the root-mean-square energy of the audio signal | 288.3 |
| 10 | pcm_fftMag_fband250-650_sma_de_quartile2 | Represents the second quartile (median) of the first-order delta of the spectral amplitude of the audio signal in the frequency band 250 Hz to 650 Hz | 280.2 |
| 11 | pcm_fftMag_fband250-650_sma_amean | Represents the mean value of the spectral amplitude of the audio signal in the frequency band 250 Hz to 650 Hz | 279.83 |
| 12 | audspec_lengthL1norm_sma_percentile1.0 | Represents the 1.0 percentile of the L1 norm (length) of the auditory spectrum of the audio signal | 279.23 |
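
The suffixes in the feature names (amean, rqmean, percentile1.0, de_quartile2) denote summary statistics computed over a frame-level contour. The following Python sketch illustrates that idea for RMS energy on a synthetic signal. It is not the extraction pipeline used in the study: the function names, the 25 ms frame with 10 ms hop, the 3-frame moving average standing in for the sma step, and NumPy's default percentile interpolation are all assumptions made for illustration, so the values will not match those of a dedicated feature extractor.

```python
import numpy as np

def frame_rms(signal, frame_len, hop):
    """Frame-level RMS energy of a 1-D signal (trailing partial frame dropped)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return np.sqrt(np.mean(frames ** 2, axis=1))

def smooth(contour, width=3):
    """Simple moving average (assumed stand-in for the `sma` step)."""
    kernel = np.ones(width) / width
    return np.convolve(contour, kernel, mode="same")

def functionals(contour):
    """Summary statistics named after the suffixes in the table."""
    delta = np.diff(contour)  # first-order difference (the `de` contour)
    return {
        "amean": float(np.mean(contour)),                     # arithmetic mean
        "rqmean": float(np.sqrt(np.mean(contour ** 2))),      # root quadratic mean
        "percentile1.0": float(np.percentile(contour, 1.0)),  # 1st percentile
        "de_quartile2": float(np.median(delta)),              # median of the delta
    }

# Synthetic one-second tone with rising amplitude (hypothetical test signal).
sr = 16000
t = np.arange(sr) / sr
signal = np.linspace(0.1, 1.0, sr) * np.sin(2 * np.pi * 220 * t)

# 400 samples = 25 ms frame, 160 samples = 10 ms hop at 16 kHz (assumed values).
contour = smooth(frame_rms(signal, frame_len=400, hop=160))
feats = functionals(contour)
print(feats)
```

For a non-negative contour the root quadratic mean is never smaller than the arithmetic mean, and a low percentile such as percentile1.0 tracks the quietest frames rather than the typical level, which is why the two RMS-energy features in the table (ranks 1 and 9) capture different aspects of the same contour.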