Lexiconfree beam search #101
base: lattice_traces
Two general points:
With this PR alone, the search algorithm is not usable yet. I will make PR's for an Flf node and python bindings separately. But I can include the
Yeah, probably. Maybe even into |
src/Search/LexiconfreeTimesyncBeamSearch/LexiconfreeTimesyncBeamSearch.hh
src/Search/LexiconfreeTimesyncBeamSearch/LexiconfreeTimesyncBeamSearch.cc
…lattice from beam
static const Core::ParameterBool paramUseSentenceEnd;
static const Core::ParameterBool paramSentenceEndIndex;
These two are currently not used. I don't know whether you want to keep them in case you introduce sentence-end handling at some point, or remove them for now.
Simple time-synchronous beam search algorithm based on the new SearchAlgorithmV2 interface. Does not use a (proper) pronunciation lexicon, word-level LM or transition model. Performs special handling of blank if a blank index is set. Main purpose is open-vocabulary search with CTC/Neural Transducer (or similar) models. Supports global pruning by max beam-size and by score difference to the best hypothesis. Uses a LabelScorer for context initialization/extension and scoring.
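To illustrate the two global pruning criteria mentioned above, here is a minimal, self-contained sketch. It is not the code from this PR: the `Hyp` struct and the function names `scorePruning` and `beamSizePruning` are hypothetical, and it assumes that scores are accumulated costs (e.g. negative log-probabilities), so lower is better.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified hypothesis type; the class used in the PR may differ.
struct Hyp {
    std::vector<int> labels;  // label indices emitted so far
    double           score;   // accumulated cost (assumption: lower is better)
};

inline bool betterThan(const Hyp& a, const Hyp& b) {
    return a.score < b.score;
}

// Score pruning: drop every hypothesis whose cost exceeds the best cost
// by more than `scoreThreshold`.
inline void scorePruning(std::vector<Hyp>& beam, double scoreThreshold) {
    if (beam.empty()) {
        return;
    }
    const double best = std::min_element(beam.begin(), beam.end(), betterThan)->score;
    beam.erase(std::remove_if(beam.begin(), beam.end(),
                              [best, scoreThreshold](const Hyp& h) {
                                  return h.score > best + scoreThreshold;
                              }),
               beam.end());
}

// Beam-size pruning: keep only the `maxBeamSize` lowest-cost hypotheses.
// The survivors are not guaranteed to be sorted afterwards.
inline void beamSizePruning(std::vector<Hyp>& beam, std::size_t maxBeamSize) {
    assert(maxBeamSize > 0);
    if (beam.size() <= maxBeamSize) {
        return;
    }
    const auto nth = beam.begin() + static_cast<std::ptrdiff_t>(maxBeamSize);
    std::nth_element(beam.begin(), nth, beam.end(), betterThan);
    beam.erase(nth, beam.end());
}
```

In a time-synchronous search, both functions would typically be applied once per time step after all hypotheses have been extended and scored. The order in which the PR applies them is not stated here, so the sketch leaves it to the caller.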
The search requires a lexicon that represents the vocabulary. Each lemma is viewed as a token with its index in the lexicon corresponding to the associated output index of the LabelScorer.
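The correspondence between lexicon position and LabelScorer output index can be pictured with the following sketch. It uses a plain `std::vector<std::string>` as a stand-in for the lexicon, and the names `labelsToTokens` and `kNoBlank` are hypothetical. The only blank handling shown is skipping the blank label when one is configured; whatever else the PR does for blank (for example in hypothesis recombination) is not modelled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sentinel meaning "no blank index is set".
constexpr int kNoBlank = -1;

// Map label indices (LabelScorer output indices) to tokens.
// Assumption: vocab[i] is the token of the i-th lemma in the lexicon.
inline std::vector<std::string> labelsToTokens(const std::vector<std::string>& vocab,
                                               const std::vector<int>&         labels,
                                               int                             blankIndex = kNoBlank) {
    std::vector<std::string> tokens;
    for (int label : labels) {
        if (label == blankIndex) {
            continue;  // blank does not produce an output token
        }
        // at() throws std::out_of_range for labels outside the vocabulary
        tokens.push_back(vocab.at(static_cast<std::size_t>(label)));
    }
    return tokens;
}
```

For example, with the vocabulary `{"<blank>", "a", "b"}` and blank index 0, the label sequence `{1, 0, 2}` maps to the tokens `a` and `b`. Without a blank index, all three entries are kept.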
Depends on #103 and #104.