subaligner.embedder module

class subaligner.embedder.FeatureEmbedder(n_mfcc: int = 13, frequency: int = 16000, hop_len: int = 512, step_sample: float = 0.04, len_sample: float = 0.075)[source]

Bases: object

Audio and subtitle feature embedding.

duration_to_position(seconds: float) → int[source]

Return the cell position from a time in seconds.

Parameters

seconds (float) – The duration in seconds.

Returns

int – The cell position.

extract_data_and_label_from_audio(audio_file_path: str, subtitle_file_path: Optional[str, None], subtitles: Optional[pysrt.SubRipFile, None] = None, sound_effect_start_marker: Optional[str, None] = None, sound_effect_end_marker: Optional[str, None] = None) → Tuple[numpy.ndarray, numpy.ndarray][source]

Generate a training dataset from an audio file and its subtitles.

Parameters
  • audio_file_path (string) – The path to the audio file.

  • subtitle_file_path (string) – The path to the subtitle file.

Keyword Arguments
  • subtitles (pysrt.SubRipFile) – The SubRipFile object (default: None).

  • sound_effect_start_marker (string) – A string indicating the start of the ignored sound effect (default: None).

  • sound_effect_end_marker (string) – A string indicating the end of the ignored sound effect (default: None).

Returns

tuple – The training data and the training labels.

property frequency

Get the sample rate.

Returns

int – The sample rate.

get_len_mfcc() → float[source]

Get the number of samples needed to span LEN_SAMPLE, computed as LEN_SAMPLE / (HOP_LEN / FREQUENCY).

Returns

float – The number of samples.

get_step_mfcc() → float[source]

Get the number of samples needed to span STEP_SAMPLE, computed as STEP_SAMPLE / (HOP_LEN / FREQUENCY).

Returns

float – The number of samples.
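
As a worked example of the two formulas above, the following sketch evaluates them with the constructor defaults listed in the class signature (frequency 16000, hop_len 512, len_sample 0.075, step_sample 0.04). The helper name `samples_per_window` is hypothetical and not part of subaligner; it only restates the documented arithmetic.

```python
# Constructor defaults taken from the FeatureEmbedder signature.
FREQUENCY = 16000   # sample rate in Hz
HOP_LEN = 512       # samples per frame
LEN_SAMPLE = 0.075  # seconds
STEP_SAMPLE = 0.04  # seconds


def samples_per_window(window_seconds: float, hop_len: int, frequency: int) -> float:
    """Hypothetical helper: window_seconds / (hop_len / frequency)."""
    return window_seconds / (hop_len / frequency)


len_mfcc = samples_per_window(LEN_SAMPLE, HOP_LEN, FREQUENCY)
step_mfcc = samples_per_window(STEP_SAMPLE, HOP_LEN, FREQUENCY)

# 512 / 16000 = 0.032 s per hop, so the results are about 2.34 and 1.25.
print(f"len_mfcc = {len_mfcc:.5f}")
print(f"step_mfcc = {step_mfcc:.5f}")
```

Neither result is a whole number with the defaults, which is consistent with both methods being documented as returning a float.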

property hop_len

Get the number of samples per frame.

Returns

int – The number of samples per frame.

property len_sample

Get the length in seconds for the input samples.

Returns

float – The length in seconds for the input samples.

property n_mfcc

Get the number of MFCC components.

Returns

int – The number of MFCC components.

position_to_duration(position: int) → float[source]

Return the time in seconds from a cell position.

Parameters

position (int) – The cell position.

Returns

float – The number of seconds.

position_to_time_str(position: int) → str[source]

Return the time string from a cell position.

Parameters

position (int) – The cell position.

Returns

string – The time string (e.g., 01:23:20,150).
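
The returned string uses the SubRip `HH:MM:SS,mmm` layout, as in the documented example 01:23:20,150. A minimal sketch of that formatting, starting from a duration in seconds, might look like the following. `seconds_to_time_str` is a hypothetical helper, not subaligner's implementation, and how the library maps a cell position to seconds is not shown on this page.

```python
def seconds_to_time_str(seconds: float) -> str:
    """Hypothetical helper: format a non-negative duration as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)           # work in whole milliseconds
    hours, rem = divmod(total_ms, 3_600_000)   # 3,600,000 ms per hour
    minutes, rem = divmod(rem, 60_000)         # 60,000 ms per minute
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


print(seconds_to_time_str(5000.15))  # 01:23:20,150
```

Converting to whole milliseconds first keeps the hour, minute, and second fields free of floating-point remainders.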

property step_sample

Get the interval (in seconds) between the beginning of each sample.

Returns

float – The interval (in seconds) between the beginning of each sample.

time_to_position(pysrt_time: pysrt.SubRipTime) → int[source]

Return a cell position from a timestamp.

Parameters

pysrt_time (pysrt.SubRipTime) – SubRipTime or coercible.

Returns

int – The cell position.

classmethod time_to_sec(pysrt_time: pysrt.SubRipTime) → float[source]

Convert a timestamp to seconds.

Parameters

pysrt_time (pysrt.SubRipTime) – SubRipTime or coercible.

Returns

float – The number of seconds.
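
A minimal sketch of this kind of conversion, assuming the timestamp exposes `hours`, `minutes`, `seconds`, and `milliseconds` fields as `pysrt.SubRipTime` does. The `Timestamp` stand-in and the `to_seconds` function are hypothetical names used so the example runs without pysrt; any rounding applied by subaligner itself may differ.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    """Hypothetical stand-in for the fields of pysrt.SubRipTime."""
    hours: int
    minutes: int
    seconds: int
    milliseconds: int


def to_seconds(ts: Timestamp) -> float:
    """Sum the timestamp components into a single value in seconds."""
    return ts.hours * 3600 + ts.minutes * 60 + ts.seconds + ts.milliseconds / 1000


# 01:23:20,150 is 3600 + 1380 + 20 + 0.150 seconds, about 5000.15.
print(to_seconds(Timestamp(1, 23, 20, 150)))
```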