Scaletempo2, the new default for adjusting audio playback speed, sounds noticeably worse in some situations #8705
Comments
I suggest voting with +1 and -1 on the original post to vote on changing the default. Edit: yes, can reproduce |
Suggestion: scaletempo3, which uses scaletempo for speeds between 0.80 and 1.3 and scaletempo2 for speeds outside of that range. |
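To make the suggestion concrete, here is a minimal Python sketch of the selection rule. The function name `pick_tempo_filter` is hypothetical, "scaletempo3" does not exist in mpv, and treating the 0.80 and 1.3 bounds as inclusive is my assumption.

```python
def pick_tempo_filter(speed: float, low: float = 0.80, high: float = 1.3) -> str:
    """Return the filter a hypothetical "scaletempo3" would delegate to."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    # Inside the band (bounds assumed inclusive), use the old filter.
    if low <= speed <= high:
        return "scaletempo"
    # Outside the band, use the newer filter.
    return "scaletempo2"


for s in (0.5, 0.95, 1.1, 2.0):
    print(f"--speed={s} --af={pick_tempo_filter(s)}")
```

A hard switch like this would change the sound character when the speed crosses a bound during playback, which is one argument for the separate-option approach instead.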
Or maybe have it configurable separately as we have it with scale algorithms. |
The best solution would just be to make scaletempo2 work better even at minor speed changes! Chrome's own audio scaling, which scaletempo2 is a port of, seems to work fine with minor adjustments. @DorianRudolph, perhaps you might have an idea of why scaletempo2 performs worse than Chrome does at 0.95x speed? Is there any hope for just tweaking it to handle this use case? That would of course be the ideal solution! My thinking is that if scaletempo is going to be restored as the default, that should happen soon to avoid further confusion for people. Also I'm basing this on the assumption that minor speed changes in the 0.85x - 1.2x range are far more common amongst MPV users, like they are for me, though I'm not sure if that's true. No matter the outcome, I'll also submit a PR for a docs update which adds a section explaining to users how to easily change the default to another audio scaler. (@TiGR I do think how mpv lets you choose your "audio scaling" filter is a bit idiosyncratic and hard to discover, but I think that's for a different discussion!) |
If it works fine in Chrome it may be because some versions of their code switched to resampling for speeds between 0.95 and 1.06. Personally I prefer to use resampling when close to normal speed, like when playing a 25 Hz movie on a 24 Hz display. And mpv can do sync to vsync with resampling and be configured to do that so 25 Hz movies are automatically resampled to 24 Hz. As I have started working on some fixes to scaletempo2 (not related to speed near 1) it would be good to quickly decide which scaletempo version to use (there is one more, atempo, in ffmpeg), as maintaining several WSOLA implementations will just be confusing for users and additional work for maintainers. But it may be needed if one cannot meet all users' needs. |
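For a sense of what resampling costs near normal speed, here is a small Python sketch of the arithmetic. Plain resampling changes pitch together with speed, by 12·log2(speed) semitones. The helper names are hypothetical, and the 0.95 to 1.06 band is only the range quoted above, not something checked against Chrome's source.

```python
import math


def resample_pitch_shift_semitones(speed: float) -> float:
    """Pitch change, in semitones, when speed is changed by plain resampling."""
    return 12.0 * math.log2(speed)


def prefers_resampling(speed: float, low: float = 0.95, high: float = 1.06) -> bool:
    """True if speed falls in the band quoted in the comment above (assumed inclusive)."""
    return low <= speed <= high


speed = 24 / 25  # a 25 Hz movie slowed down to a 24 Hz display
print(round(speed, 2), round(resample_pitch_shift_semitones(speed), 2))  # 0.96 -0.71
print(prefers_resampling(speed))  # True
```

So the 25 Hz to 24 Hz case lowers the pitch by roughly 0.7 of a semitone, in exchange for avoiding WSOLA artifacts entirely.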
When I built mpv from git, the first thing I noticed were rather severe audio glitches when listening to audiobooks at speeds 0.9 and 1.1 (depending on whether the narrator is too fast or too slow.) It sounds like a scratched CD where the CD player is skipping.
There doesn't seem to be an option to tell mpv which filter to use, so I had to put Something like an |
|
I can't see what speed I'm setting. |
@realnc please file a new issue, with logs and everything else which the template requests. If you can bisect it to find the exact first commit where the issue happens, that would be great info to add. |
@realnc could you please open a new issue for this? All the reports we have so far are about subjective quality, but what you're describing is new, and could very well be an actual bug - which none of us is able to reproduce. So please file a new issue, with logs, preferably sample files, bisect if you can, etc. It would help us identify a yet-unknown bug. |
I will not be too helpful commenting here, but I just want to confirm this report. I upgraded to 0.34 and wondered why voices sound robotic at speed 1.1 until I figured out that apparently the default was changed to scaletempo2. I added af=scaletempo as an option in mpv 0.34 and apparently things went back to normal. Unfortunately, I can't offer any samples and it may be subjective, but to me it was clear as day that something had changed and voices sounded very robotic with a lot of videos (but not all!). There seem to be some exceptions, but for me, scaletempo2 is way worse. At least please don't remove scaletempo; for me scaletempo2 is very hard to bear for many files. I can try to see if I notice some kind of regularity such as audio codecs, but for me, there is something very wrong with scaletempo2. |
Use atempo instead. |
I tried atempo. It sounds similar to scaletempo2 to me (i.e. robotic). It is also not documented in the mpv manual, so I did not get the idea to use this ffmpeg filter. For me scaletempo sounds best. Is it possible that there is some kind of bug in mpv that makes scaletempo2 or atempo sound much worse for only some people? |
@kevin-stuart I don't think there's any reason why the exact same media played with the exact same version of MPV would result in any difference in sound between people. That said, I opened this issue because I observed that 6 channel audio with Also I agree that Given how long (I use |
You are right, I observed my problems with scaletempo2 with 6 channel audio. I mainly use 1.1 as speedup and scaletempo2 and atempo sound bad for me with this setup. I have set scaletempo in my config. I just hope that scaletempo2 is improved in the future and that scaletempo is not removed until then. For me, scaletempo2 became the new default only very recently when I upgraded to 0.34 |
I also noticed occasionally very bad sound with scaletempo2. |
You might want to try out |
It sounds horrible to me with speech at a speed of 0.94. Some words sound robotic, metallic and choppy. As a quick test, I was listening to this podcast: https://www.youtube.com/watch?v=cnFubyqJ3Ro Prime example is at the very beginning (0:0:45s) where he says "that the community left for us". If you set the speed to 0.94, scaletempo2 is atrocious. scaletempo is perfect. Whether I use your parameters or not doesn't change anything for me in this regard. |
Admittedly I never actually listen to anything at <1 speed, so maybe I would have noticed something at some point if I did. |
But it sounds bad on that last sample at 0.94, at least the "basically we" part. (I also mostly play media at faster speeds and I would guess that's true for most people too.) |
Interestingly:
|
Is this issue still valid on builds from current master? Also please try |
The signal energy was used for the similarity calculation in the search for the optimal overlap position. Its usage led to worse results with increased channel count. I was not able to find a situation where the inclusion of signal energy produced better results than without it. Without signal energy this effectively turns into a cross-correlation. Fixes mpv-player#8705 (comment)
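As an illustration of what "effectively turns into a cross-correlation" means, here is a minimal Python sketch of an overlap search that scores each candidate position by the plain dot product summed over channels, with no energy normalization. This is not mpv's code; the function names and the tiny test signal are made up.

```python
def dot(a, b):
    """Plain dot product of two equally long sample lists."""
    return sum(x * y for x, y in zip(a, b))


def best_overlap_offset(target, search):
    """Offset into `search` whose block correlates best with `target`.

    target: one list of samples per channel (the block to match)
    search: one longer list of samples per channel (the search region)
    """
    block_len = len(target[0])
    best_offset, best_score = 0, float("-inf")
    for off in range(len(search[0]) - block_len + 1):
        # Unnormalized cross-correlation, summed over all channels.
        score = sum(dot(t, s[off:off + block_len]) for t, s in zip(target, search))
        if score > best_score:
            best_offset, best_score = off, score
    return best_offset


target = [[1.0, -1.0, 1.0]]
search = [[0.0, 0.0, 1.0, -1.0, 1.0, 0.0, 0.0]]
print(best_overlap_offset(target, search))  # 2
```

Because nothing is divided by energy, loud channels contribute more to the score than quiet ones, which seems to be the property that matters for multichannel audio.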
The old formula worked well for stereo, but the results got worse with increased channel count. The taxicab distance works just as well for stereo, while not falling apart as the channel count grows. The downside is increased CPU usage. Maybe someone can try and vectorize this one like the old one was. The performance still isn't bad, so there is no pressing need for it. Fixes mpv-player#8705 (comment)
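For comparison, a taxicab (L1) version of the search can be sketched as below: each candidate position is scored by the sum of absolute sample differences over all channels, and the smallest distance wins. Again this is only an illustrative Python sketch with made-up names and data, not the code from the commit.

```python
def taxicab_distance(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_overlap_offset_l1(target, search):
    """Offset into `search` with the smallest L1 distance to `target`."""
    block_len = len(target[0])
    best_offset, best_dist = 0, float("inf")
    for off in range(len(search[0]) - block_len + 1):
        # Distances add up across channels; silent channels contribute zero.
        dist = sum(taxicab_distance(t, s[off:off + block_len])
                   for t, s in zip(target, search))
        if dist < best_dist:
            best_offset, best_dist = off, dist
    return best_offset


# One active channel plus five silent ones, roughly like sparse 5.1 content.
target = [[1.0, -1.0, 1.0]] + [[0.0] * 3] * 5
search = [[0.0, 0.0, 1.0, -1.0, 1.0, 0.0, 0.0]] + [[0.0] * 7] * 5
print(best_overlap_offset_l1(target, search))  # 2
```

The per-sample `abs` is harder to express as a single dot product, which is consistent with the note above about higher CPU usage and the missing vectorization.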
Playback with many audio channels could be distorted when using scaletempo2. This was most noticeable when there were a lot of quiet channels and few louder channels. Fix this by increasing the weight of louder channels in relation to quieter channels. Each channel's target block energy is factored into the usual similarity measure. To prevent bias towards louder blocks, the result is divided by the total energy across all channels. This should have very little effect on very correlated channels (such as most stereo media), as the division by total energy reverses the effect of the channel-wise factorization if all channels have similar energy. See-Also: mpv-player#8705 See-Also: mpv-player#13737
Playback with many audio channels could be distorted when using scaletempo2. This was most noticeable when there were a lot of quiet channels and few louder channels. Fix this by increasing the weight of louder channels in relation to quieter channels. Each channel's target block energy is factored into the usual similarity measure. This should have very little effect on very correlated channels (such as most stereo media), as the factors are very similar for all channels. See-Also: mpv-player#8705 See-Also: mpv-player#13737
Playback with many audio channels could be distorted when using scaletempo2. This was most noticeable when there were a lot of quiet channels and few louder channels. Fix this by increasing the weight of louder channels in relation to quieter channels. Each channel's target block energy is factored into the usual similarity measure. This should have little effect on very correlated channels (such as most stereo media), where the factors are very similar for all channels. See-Also: mpv-player#8705 See-Also: mpv-player#13737
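The effect of the weighting can be shown with a toy example. The Python sketch below assumes that the "usual similarity measure" is a per-channel normalized cross-correlation; that is my assumption, and the function names and sample data are invented for illustration rather than taken from the commit.

```python
import math


def energy(block):
    """Sum of squared samples."""
    return sum(x * x for x in block)


def normalized_correlation(t, c):
    """Per-channel similarity in [-1, 1]; 0.0 if either block is silent."""
    denom = math.sqrt(energy(t) * energy(c))
    return sum(x * y for x, y in zip(t, c)) / denom if denom > 0 else 0.0


def unweighted(target, candidate):
    # Every channel counts equally, however quiet it is.
    return sum(normalized_correlation(t, c) for t, c in zip(target, candidate))


def energy_weighted(target, candidate):
    # Each channel's similarity is scaled by its target block energy.
    return sum(energy(t) * normalized_correlation(t, c)
               for t, c in zip(target, candidate))


def best_offset(target, search, similarity):
    block_len = len(target[0])
    offsets = range(len(search[0]) - block_len + 1)
    return max(offsets, key=lambda off: similarity(
        target, [ch[off:off + block_len] for ch in search]))


loud_t, loud_s = [1.0, -1.0, 1.0], [0.0, 0.0, 1.0, -1.0, 1.0, 0.0, 0.0]
quiet_t, quiet_s = [0.01, 0.01, 0.01], [0.01, 0.01, 0.01, 0.0, -0.01, 0.0, 0.0]
target = [loud_t] + [quiet_t] * 3
search = [loud_s] + [quiet_s] * 3

print(best_offset(target, search, unweighted))       # 0
print(best_offset(target, search, energy_weighted))  # 2
```

With equal weights, the three nearly silent channels outvote the loud one and pull the match to offset 0. With energy weighting, the loud channel decides and the match lands on offset 2, where the audible signal actually lines up.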
I'm not convinced scaletempo2 is actually better than the original scaletempo at any speed. The problem is that mpv's default parameters for scaletempo give suboptimal results, so when comparing each filter at default settings, scaletempo2 comes out ahead. But properly configured, scaletempo still beats scaletempo2. I have Meanwhile, for scaletempo2, no combination of parameters can guarantee artifact-free audio at any speed. And the artifacts can actually be quite severe. scaletempo2 has audible pitch shifting of as much as a semitone on drone notes in the background music, which sounds like wrong notes being played and is really distracting. The subjective quality improvement at higher speeds is simply due to the artifacts being harder to hear since they go by so quickly, but they're still there. For speeds < 1, scaletempo2 sounds similar to scaletempo, but WSOLA-type algorithms are all a bit of a crapshoot. FFT methods are better for that IMO. |
@mesvam here is a little excerpt from a song with your scaletempo parameters |
@christoph-heinrich ok I stand corrected. That timbre was worse than I expected. I will say though, that even in that worst case, it's still better than when scaletempo2 goes wonky. Here is an example of background music going crazy with scaletempo2. What's worse is that the artifacts in your excerpt are mainly due to the bass frequencies, which can be fixed by increasing stride/search to 30 or higher |
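A rough way to see why a longer window could help with bass: a window has to span at least one full period of a tone before an overlap search can line that tone up. The Python sketch below computes the lowest frequency whose full period fits in a window given in milliseconds (the unit scaletempo's stride and search options use). The rule of thumb is my assumption, not something stated in this thread, and the function name is hypothetical.

```python
def lowest_full_period_hz(window_ms: float) -> float:
    """Lowest frequency (Hz) whose full period fits inside the window."""
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return 1000.0 / window_ms


for ms in (14, 30, 60):
    print(f"{ms} ms window covers one full period down to "
          f"{lowest_full_period_hz(ms):.1f} Hz")
```

By this estimate a 30 ms window reaches down to about 33 Hz, which covers most bass content, while a 14 ms window only reaches about 71 Hz.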
Playback with many audio channels could be distorted when using scaletempo2. This was most noticeable when there were a lot of quiet channels and few louder channels. Fix this by increasing the weight of louder channels in relation to quieter channels. Each channel's target block energy is factored into the usual similarity measure. This should have little effect on very correlated channels (such as most stereo media), where the factors are very similar for all channels. See-Also: #8705 See-Also: #13737
Well #13748 improved this but I don't think it's necessarily fixed judging by the comments so reopening. |
Is there any way to fix desync of atempo? because its still best one imo. |
not really, atempo filter changes timestamps and that causes desync, workaround is adding some hack which would rescale those timestamps back to original values that mpv expects. |
what do you mean by hacks like this? |
Yes, something like that hardcoded to keep A/V sync but that breaks seeking to right spot... |
I have developed a prototype filter that can stretch audio by a factor of 2, using autocorrelation by RDFT to find similar periods, plus interpolating the found periods with an equal-power cross-fade that makes use of the normalized cross-correlation factor between two periods. The output is much better than scaletempo(2) or atempo. I need to do something similar for the 1/2 factor for a 2x speed gain. |
Got 0.5 and 2.0 ratios working well and fast. Maybe will add support for arbitrary ratios. If anybody is interested in taking a look at it, I can push the filter into librempeg. |
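The two building blocks mentioned here can be sketched in a few lines of Python. This is not the filter described above: the autocorrelation is computed directly rather than via an RDFT, the cross-fade ignores the cross-correlation factor, and all names and the test signal are made up.

```python
import math


def estimate_period(signal, min_lag, max_lag):
    """Lag (in samples, min_lag >= 1) with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        a, b = signal[:-lag], signal[lag:]
        denom = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = sum(x * y for x, y in zip(a, b)) / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def crossfade_gains(i, n):
    """Equal-power gains for sample i of n: g_out**2 + g_in**2 == 1."""
    theta = (i + 0.5) / n * (math.pi / 2)
    return math.cos(theta), math.sin(theta)


def equal_power_crossfade(a, b):
    """Fade out `a` while fading in `b`, keeping the summed power constant."""
    n = len(a)
    out = []
    for i in range(n):
        g_out, g_in = crossfade_gains(i, n)
        out.append(g_out * a[i] + g_in * b[i])
    return out


# A sine with a period of 20 samples.
signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]
period = estimate_period(signal, 5, 30)
print(period)  # 20
blended = equal_power_crossfade(signal[:period], signal[period:2 * period])
print(len(blended))  # 20
```

Equal-power gains suit material that is not perfectly correlated. For two nearly identical periods the sum can overshoot, which is presumably why the filter described above also takes the correlation between the two periods into account.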
Looks like nobody interested in high-quality, removed code, and moved on to do other stuff. |
Anybody interested in 0.5 and 2.0 fixed tempo filter try ascale filter. |
yes. especially for 2.0. |
I don't know if Paul provides any builds, but you can use, the ones from #14977. |
well this will work only if audio-only is used, otherwise runtime AV adjust within mpv is not possible, maybe one can hack some new filter into mpv, once i figure how to do arbitrary scaling, and fix not so small issues with >=2 channels and solo high volume bass-rich & treble-poor audio. |
Now all tempo values from 0.5 to 2.0 should be supported, and it sounds fine to me; >=2 channels and heavy bass audio with < 1.0 tempo still remain to be fixed. |
Important Information
Provide following Information:
This is relevant because scaletempo2 was changed to the default from scaletempo in #8376.
Reproduction steps
Try listening to some 5.1 audio at 0.95x speed using the now default scaletempo2 filter.
$ mpv --speed=0.95 --af=scaletempo2 some_audio
Listen for the poor quality in some situations.
Now add the old default, scaletempo, to the af filter chain and listen for the better quality.
$ mpv --speed=0.95 --af=scaletempo some_audio
Also listen to the recorded sample files below. You can use the original_source.mkv file included to reproduce the samples I recorded.
Expected behavior
The default should not make things worse.
Actual behavior
scaletempo2 is worse for minor speed changes in the 0.85x - 1.2x range. It's MUCH better for the big speed changes though, and I really appreciate it for that.
While I do appreciate scaletempo2 for big adjustments, I usually only make minor speed changes, so for me it's not a good default. I suspect that playback speed adjustments in the 0.9-1.2x range are much more common amongst users. This comment is where another user seems to have been caught out by this default change.
Sample files