subaligner.embedder module
-
class subaligner.embedder.FeatureEmbedder(n_mfcc: int = 13, frequency: int = 16000, hop_len: int = 512, step_sample: float = 0.04, len_sample: float = 0.075)
Bases: object
Audio and subtitle feature embedding.
-
duration_to_position(seconds: float) → int
Return the cell position from a time in seconds.
- Parameters
seconds – {float} – The duration in seconds.
- Returns
int – The cell position.
-
extract_data_and_label_from_audio(audio_file_path: str, subtitle_file_path: Optional[str], subtitles: Optional[pysrt.SubRipFile] = None, sound_effect_start_marker: Optional[str] = None, sound_effect_end_marker: Optional[str] = None) → Tuple[numpy.ndarray, numpy.ndarray]
Generate a training dataset from an audio file and its subtitles.
- Parameters
audio_file_path – {string} – The path to the audio file.
subtitle_file_path – {string} – The path to the subtitle file.
- Keyword Arguments
subtitles – {pysrt.SubRipFile} – The SubRipFile object (default: {None}).
sound_effect_start_marker – {string} – A string indicating the start of the ignored sound effect (default: {None}).
sound_effect_end_marker – {string} – A string indicating the end of the ignored sound effect (default: {None}).
- Returns
tuple – The training data and the training labels.
-
property frequency
Get the sample rate.
- Returns
int – The sample rate.
-
get_len_mfcc() → float
Get the number of MFCC frames that span LEN_SAMPLE seconds, computed as LEN_SAMPLE / (HOP_LEN / FREQUENCY).
- Returns
float – The number of frames.
-
get_step_mfcc() → float
Get the number of MFCC frames that span STEP_SAMPLE seconds, computed as STEP_SAMPLE / (HOP_LEN / FREQUENCY).
- Returns
float – The number of frames.
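As a worked example of the two formulas above, the sketch below evaluates DURATION / (HOP_LEN / FREQUENCY) with the constructor defaults (hop_len = 512, frequency = 16000, len_sample = 0.075, step_sample = 0.04). The helper name `mfcc_count` is hypothetical and not part of subaligner; it only reproduces the documented arithmetic.

```python
def mfcc_count(duration_s: float, hop_len: int = 512, frequency: int = 16000) -> float:
    """Hypothetical helper: evaluate duration / (hop_len / frequency).

    hop_len / frequency is the time in seconds covered by one hop,
    so the result says how many hops fit into duration_s.
    """
    seconds_per_hop = hop_len / frequency  # 512 / 16000 = 0.032 s
    return duration_s / seconds_per_hop


# Constructor defaults documented for FeatureEmbedder.
len_mfcc = mfcc_count(0.075)   # LEN_SAMPLE / (HOP_LEN / FREQUENCY), about 2.34
step_mfcc = mfcc_count(0.04)   # STEP_SAMPLE / (HOP_LEN / FREQUENCY), about 1.25

print(len_mfcc, step_mfcc)
```

Both results are fractional, which is consistent with the documented `float` return type.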
-
property hop_len
Get the number of samples per frame.
- Returns
int – The number of samples per frame.
-
property len_sample
Get the length in seconds for the input samples.
- Returns
float – The length in seconds for the input samples.
-
property n_mfcc
Get the number of MFCC components.
- Returns
int – The number of MFCC components.
-
position_to_duration(position: int) → float
Return the time in seconds from a cell position.
- Parameters
position – {int} – The cell position.
- Returns
float – The number of seconds.
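This page does not document the exact mapping between seconds and cell positions. Purely for illustration, the sketch below assumes that cells lie on a uniform grid spaced step_sample seconds apart, with cell 0 starting at 0 s. The function names and this grid assumption are hypothetical; the library's own conversion may include offsets that are not shown here.

```python
STEP_SAMPLE = 0.04  # documented default spacing between samples, in seconds


def seconds_to_cell(seconds: float, step_sample: float = STEP_SAMPLE) -> int:
    """Hypothetical: index of the grid cell that contains `seconds`.

    Assumes cell k covers [k * step_sample, (k + 1) * step_sample).
    """
    return int(seconds / step_sample)


def cell_to_seconds(position: int, step_sample: float = STEP_SAMPLE) -> float:
    """Hypothetical inverse: start time in seconds of cell `position`."""
    return position * step_sample


cell = seconds_to_cell(2.03)     # falls inside the cell starting at 2.0 s
start = cell_to_seconds(cell)    # start of that cell, in seconds
print(cell, start)
```

Under this assumption the round trip is lossy: converting a time to a cell and back returns the start of the cell, not the original time.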
-
position_to_time_str(position: int) → str
Return the time string from a cell position.
- Parameters
position – {int} – The cell position.
- Returns
string – The time string (e.g., 01:23:20,150).
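The example time string has the shape hours:minutes:seconds,milliseconds. A minimal sketch of producing such a string from a duration in seconds follows; the helper `format_time_str` is hypothetical and is not the library's implementation, which takes a cell position rather than seconds.

```python
def format_time_str(seconds: float) -> str:
    """Hypothetical helper: format seconds as HH:MM:SS,mmm."""
    # Work in whole milliseconds to avoid floating-point drift.
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


# 1 h + 23 min + 20.150 s = 5000.15 s
time_str = format_time_str(5000.15)
print(time_str)
```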
-
property step_sample
Get the interval (in seconds) between the beginning of each sample.
- Returns
float – The interval (in seconds) between the beginning of each sample.
-