Scaletempo2, the new default for adjusting audio playback speed, sounds noticeably worse in some situations #8705

varenc · 2021-04-07T22:33:33Z

Important Information

Provide following Information:

mpv version: freshly compiled latest from master (mpv 0.33.0-109-gd0c530919d)
macOS

This is relevant because scaletempo2 was changed to the default from scaletempo in #8376

Reproduction steps

Try listening to some 5.1 audio at 0.95x speed using the now default scaletempo2 filter.
$ mpv --speed=0.95 --af=scaletempo2 some_audio
Listen for the poor quality in some situations.

Now add the old default, scaletempo, to the af filter chain and listen for the better quality.
$ mpv --speed=0.95 --af=scaletempo some_audio

Also listen to the recorded sample files below. You can use the original_source.mkv file included to reproduce the samples I recorded.

Expected behavior

The default should not make things worse.

Actual behavior

scaletempo2 is worse for minor speed changes in the 0.85x - 1.2x range. It's MUCH better for the big speed changes though, and I really appreciate it for that.

While I do appreciate scaletempo2 for big adjustments, I usually only make minor speed changes, so for me it's not a good default. I suspect that playback speed adjustments in the 0.9-1.2x range are much more common amongst users. This comment is where another user seems to have been caught up in this default change.

Sample files

scaletempo2_0.95x_speed.wav (Bad)
scaletempo_0.95x_speed.wav (Better)
1x_speed.wav (1x speed, recorded in the same way)
original_source_audio.mkv

Hrxn · 2021-04-07T22:52:25Z

I suggest voting with +1 and -1 on the original post to vote on changing the default.
+1 to vote for changing the default back to scaletempo (unless there is some fix)
-1 to vote for keeping the current default.

Edit: yes, can reproduce

CounterPillow · 2021-04-07T23:23:42Z

Suggestion: scaletempo3, which uses scaletempo for speeds between 0.80 to 1.3 and scaletempo2 for speeds outside of that range.
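The range-based selection suggested above could be sketched like this. It is only an illustration: `pick_tempo_filter` is a hypothetical helper, not an mpv function, and the thresholds are simply the 0.80 and 1.3 values from the suggestion.

```python
def pick_tempo_filter(speed: float, low: float = 0.80, high: float = 1.3) -> str:
    """Return the name of the filter to use for a given playback speed.

    Hypothetical helper: speeds inside [low, high] use scaletempo,
    everything outside that range uses scaletempo2.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if low <= speed <= high:
        return "scaletempo"
    return "scaletempo2"


for s in (0.5, 0.95, 1.1, 2.0):
    print(s, pick_tempo_filter(s))
```

Whether the boundaries should be inclusive, and where exactly they belong, would need listening tests.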

TiGR · 2021-04-08T07:35:37Z

Or maybe have it configurable separately as we have it with scale algorithms.

varenc · 2021-04-10T23:36:05Z

The best solution would just be to make scaletempo2 work better even at minor speed changes! Chrome's own audio scaling, which scaletempo2 is a port of, seems to work fine with minor adjustments.

@DorianRudolph, perhaps you might have an idea of why scaletempo2 performs worse than Chrome does at 0.95x speed? Is there any hope for just tweaking it to handle this use case? That would of course be the ideal solution!

My thinking is that if scaletempo is going to be restored as the default, that should happen soon to avoid further confusion for people. Also, I'm basing this on the assumption that minor speed changes in the 0.85x - 1.2x range are far more common amongst MPV users, like they are for me, though I'm not sure if that's true. No matter the outcome, I'll also submit a PR for a docs update which adds a section explaining to users how to easily change the default to another audio scaler.

(@TiGR I do think how mpv lets you choose your "audio scaling" filter is a bit idiosyncratic and hard to discover, but I think that's for a different discussion!)

DanOscarsson · 2021-04-11T09:47:34Z

If it works fine in Chrome it may be because some versions of their code switched to resampling between speed 0.95 - 1.06. Personally I prefer to use resampling when close to normal speed, like when playing a 25 Hz movie on a 24 Hz display. And mpv can do sync to vsync with resampling and be configured to do that so 25 Hz movies are automatically resampled to 24 Hz.
My only need for preserving pitch is when playing at a fast speed like > 1.5.
And that is what I would have expected most users need scaletempo2/scaletempo for.
Apparently that may not be true, but it cannot be determined without asking a lot of users.
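To put a number on the resampling case: playing 25 Hz material at 24 Hz without pitch correction means a speed of 24/25 = 0.96, and the pitch drops by the same ratio. A minimal sketch of that arithmetic follows; the function names are made up for illustration.

```python
import math


def resample_speed(source_rate_hz: float, display_rate_hz: float) -> float:
    """Playback speed needed to show source_rate_hz material at display_rate_hz."""
    return display_rate_hz / source_rate_hz


def pitch_shift_semitones(speed: float) -> float:
    """Pitch change, in semitones, when audio is resampled by `speed`
    without pitch correction (12 semitones per doubling of frequency)."""
    return 12.0 * math.log2(speed)


speed = resample_speed(25.0, 24.0)
shift = pitch_shift_semitones(speed)
print(f"speed {speed:.2f}x, pitch shift {shift:.2f} semitones")
```

At 0.96x the shift is roughly 0.7 of a semitone downwards, which is why plain resampling can be acceptable this close to normal speed.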

As I have started working on some fixes to scaletempo2 (not related to speed near 1), it would be good to quickly decide which scaletempo version to use (there is one more, atempo, in ffmpeg), as maintaining several WSOLA implementations will just be confusing for users and additional work for maintainers. But it may be needed if one filter cannot meet all users' needs.

realnc · 2021-05-29T12:18:49Z

When I built mpv from git, the first thing I noticed were rather severe audio glitches when listening to audiobooks at speeds 0.9 and 1.1 (depending on whether the narrator is too fast or too slow.) It sounds like a scratched CD where the CD player is skipping.

scaletempo produces perfect results at these playback speeds. You can't even tell the sound is slowed down or sped up. It really sounds like the narrator is just reading slower or faster.

There doesn't seem to be an option to tell mpv which filter to use, so I had to put af-add=scaletempo in my config. Unfortunately, this disabled mpv's automatic filter removal when the filter is not needed. The filter is always active and shows up in the OSD all the time.

Something like an --audio-speed-filter option would be very nice to have instead of hardcoding scaletempo2 in the mpv source code.

garoto · 2021-05-29T13:57:54Z

[	 no-osd af add "@tempo:scaletempo" ; no-osd add speed "-0.1"
]	 no-osd af add "@tempo:scaletempo" ; no-osd add speed "+0.1"
BS	 no-osd af remove @tempo ; no-osd set speed 1.0

realnc · 2021-05-29T15:32:55Z

[	 no-osd af add "@tempo:scaletempo" ; no-osd add speed "-0.1"
]	 no-osd af add "@tempo:scaletempo" ; no-osd add speed "+0.1"
BS	 no-osd af remove @tempo ; no-osd set speed 1.0

I can't see what speed I'm setting.

avih · 2021-05-29T16:41:52Z

@realnc please file a new issue, with logs and everything else which the template requests

If you can bisect it to find the exact first commit where the issue happens, that would be great info to add.

avih · 2021-05-31T10:49:02Z

It sounds like a scratched CD where the CD player is skipping

@realnc could you please open a new issue for this? All the reports we have so far are about subjective quality, but what you're describing is new, and could very well be an actual bug - which none of us is able to reproduce.

So please file a new issue, with logs, preferably sample files, bisect if you can, etc. It would help us identify a yet-unknown bug.

kevin-stuart · 2021-11-14T14:20:36Z

I will not be too helpful commenting here, but I just want to confirm this report.

I upgraded to 0.34 and wondered why voices sound robotic at speed 1.1 until I figured out that apparently the default was changed to scaletempo2. I added af=scaletempo as an option in mpv 0.34 and apparently things went back to normal.

Unfortunately, I can't offer any samples and it may be subjective, but to me it was clear as day that something had changed and voices sounded very robotic with a lot of videos (but not all!). There seem to be some exceptions, but for me, scaletempo2 is way worse.

At least please don't remove scaletempo, for me scaletempo2 is very hard to bear for many files. I can try to see if I notice some kind of regularity such as audio codecs, but for me, there is something very wrong with scaletempo2.

richardpl · 2021-11-14T16:14:41Z

Use atempo instead.

kevin-stuart · 2021-11-17T09:40:02Z

I tried atempo. It sounds similar to scaletempo2 to me (i.e. robotic). It is also not documented in the mpv manual, so I did not get the idea to use this ffmpeg filter. For me scaletempo sounds best. Is it possible that there is some kind of bug in mpv that makes scaletempo2 or atempo sound much worse for only some people?

varenc · 2021-11-27T22:12:01Z

@kevin-stuart I don't think there's any reason why the exact same media played with the exact same version of MPV would result in any difference in sound between people. That said, I opened this issue because I observed that 6 channel audio with scaletempo2 seemed to give worse results than scaletempo when there's a very minor speed adjustment. But the issue went away with most stereo audio. I suspect you're experiencing the same issue. If you can post a small sample that'll help people confirm.

Also I agree that atempo also performs well, but atempo isn't fully supported by mpv and it will eventually lead to an out of sync audio and video. But if you're just playing audio you might not care. I described the atempo issue and some very janky workarounds here: #4418 (comment) For me, scaletempo2 removes my need for atempo.

Given how long scaletempo2 has been the default at this point, unless a lot more people find this issue and concur, I think leaving it the default will be the least disruptive for the most folks. In the meantime, just making it easy to switch back to scaletempo is an easy solution. Maybe adding that to the default input.conf could help (though it's tough to decide on the key).

(I use $ af toggle scaletempo in my input.conf to make the $ key toggle it)

kevin-stuart · 2021-11-27T23:03:28Z

You are right, I observed my problems with scaletempo2 with 6 channel audio. I mainly use 1.1 as speedup and scaletempo2 and atempo sound bad for me with this setup. I have set scaletempo in my config. I just hope that scaletempo2 is improved in the future and that scaletempo is not removed until then. For me, scaletempo2 became the new default only very recently when I upgraded to 0.34

dardoor · 2022-01-09T00:11:47Z

I also noticed occasionally very bad sound with scaletempo2.
Here's an example from a movie with 2 channel audio, comparing scaletempo and scaletempo2 at 1.1x and 1.21x speeds:
scaletempo mpv test.zip

christoph-heinrich · 2022-08-05T01:22:32Z

You might want to try out --af=scaletempo2=search-interval=50:window-size=40.
I've tried the example from @dardoor (original (1x).opus) and it sounds great at various speeds (>1).

realnc · 2022-08-05T07:37:00Z

You might want to try out --af=scaletempo2=search-interval=50:window-size=40. I've tried the example from @dardoor (original (1x).opus) and it sounds great at various speeds.

It sounds horrible to me with speech at a speed of 0.94. Some words sound robotic, metallic and choppy. As a quick test, I was listening to this podcast:

https://www.youtube.com/watch?v=cnFubyqJ3Ro

Prime example is at the very beginning (0:0:45s) where he says "that the community left for us". If you set the speed to 0.94, scaletempo2 is atrocious. scaletempo is perfect.

Whether I use your parameters or not doesn't change anything for me in this regard.

christoph-heinrich · 2022-08-05T15:17:29Z

mpv --no-config --start=44 --speed=0.94 --af=<filter> 'https://www.youtube.com/watch?v=cnFubyqJ3Ro'
I don't hear a problem with scaletempo2, but maybe I'm so used to it that I don't even notice it anymore.
test.zip

Admittedly I never actually listen to anything at <1 speed, so maybe I would have noticed something at some point if I did.
(videos are always >=1.25 speed for me, but I also tested with smaller values >=1)

dardoor · 2022-10-09T20:15:16Z

scaletempo2=search-interval=50:window-size=40 does sound good on the sample I posted, at 1.1 and 1.2 speeds, even a bit better than scaletempo, I think.

But it sounds bad on that last sample at 0.94, at least the "basically we" part.
scaletempo2 with no parameters sounds better, and scaletempo even better.

(I also mostly play media at faster speeds and I would guess that's true for most people too.)

mars4science · 2023-07-24T08:33:43Z

Interestingly:
After I changed scaletempo2 to scaletempo in f_auto_filters.c:
p->sub.filter = mp_create_user_filter(f, MP_OUTPUT_CHAIN_AUDIO, "scaletempo", NULL);

With --af=scaletempo=speed=<value>, the values none, both and tempo all sound about the same, like I expect tempo to sound. af=scaletempo=speed=pitch works as expected. But when I commented out that line, sound was played at 1x speed regardless of video speed.
It seems the none and both values of the speed option do not work as the man page describes:

    both
        Scale both tempo and pitch.
    none
        Ignore speed changes.

llyyr · 2023-09-25T17:11:13Z

Is this issue still valid on builds from current master? Also please try rubberband from #12479 build

The signal energy was used for the similarity calculation in the search for the optimal overlap position. Its use led to worse results with increased channel count. I was not able to find a situation where the inclusion of signal energy produced better results than without it. Without signal energy this effectively turns into a cross-correlation. Fixes mpv-player#8705 (comment)
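As a rough illustration of what a cross-correlation search for the overlap position looks like (a simplified sketch, not the code in af_scaletempo2; the function names and the list-of-channels layout are assumptions made for the example):

```python
def cross_correlation(target, candidate):
    """Sum of sample products over all channels.

    `target` and `candidate` are lists of channels, each channel a list
    of samples of equal length.
    """
    return sum(
        t * c
        for t_ch, c_ch in zip(target, candidate)
        for t, c in zip(t_ch, c_ch)
    )


def best_overlap_offset(target, search_region):
    """Return the offset into `search_region` whose block correlates best
    with `target`. `search_region` holds the same channels, but longer."""
    block_len = len(target[0])
    n_offsets = len(search_region[0]) - block_len + 1
    best_offset, best_score = 0, float("-inf")
    for offset in range(n_offsets):
        candidate = [ch[offset:offset + block_len] for ch in search_region]
        score = cross_correlation(target, candidate)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


target = [[0, 1, 0, -1]]
search_region = [[0, 0, 0, 1, 0, -1, 0, 0]]
print(best_overlap_offset(target, search_region))
```

In this toy input the target block occurs at offset 2 of the search region, which is where the correlation peaks.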

The old formula worked well for stereo, but the results got worse with increased channel count. The taxicab distance works just as well for stereo, while not falling apart as the channel count grows. The downside is increased CPU usage. Maybe someone can try and vectorize this one like the old one was. The performance still isn't bad, so there is no pressing need for it. Fixes mpv-player#8705 (comment)
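For reference, a taxicab (L1) distance search over candidate positions can be sketched as below. This only illustrates the metric named in the commit message; the helper names and data layout are assumptions, and the real filter is written in C.

```python
def taxicab_distance(target, candidate):
    """Sum of absolute sample differences over all channels (L1 distance)."""
    return sum(
        abs(t - c)
        for t_ch, c_ch in zip(target, candidate)
        for t, c in zip(t_ch, c_ch)
    )


def closest_offset(target, search_region):
    """Return the offset whose block has the smallest taxicab distance to
    `target`. Both arguments are lists of channels (lists of samples)."""
    block_len = len(target[0])
    n_offsets = len(search_region[0]) - block_len + 1
    best_offset, best_dist = 0, float("inf")
    for offset in range(n_offsets):
        candidate = [ch[offset:offset + block_len] for ch in search_region]
        dist = taxicab_distance(target, candidate)
        if dist < best_dist:
            best_offset, best_dist = offset, dist
    return best_offset


target = [[1, 2, 3], [0, 0, 0]]
search_region = [[9, 1, 2, 3, 9], [0, 0, 0, 0, 0]]
print(closest_offset(target, search_region))
```

Here the best match is a minimum rather than a maximum: a distance of 0 means the candidate block is identical to the target in every channel.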

Playback with many audio channels could be distorted when using scaletempo2. This was most noticeable when there were a lot of quiet channels and few louder channels. Fix this by increasing the weight of louder channels in relation to quieter channels. Each channel's target block energy is factored into the usual similarity measure. To prevent bias towards louder blocks, the result is divided by the total energy across all channels. This should have very little effect on very correlated channels (such as most stereo media), as the division by total energy reverses the effect of the channel-wise factorization if all channels have similar energy. See-Also: mpv-player#8705 See-Also: mpv-player#13737

Playback with many audio channels could be distorted when using scaletempo2. This was most noticeable when there were a lot of quiet channels and few louder channels. Fix this by increasing the weight of louder channels in relation to quieter channels. Each channel's target block energy is factored into the usual similarity measure. This should have very little effect on very correlated channels (such as most stereo media), as the factors are very similar for all channels. See-Also: mpv-player#8705 See-Also: mpv-player#13737

Playback with many audio channels could be distorted when using scaletempo2. This was most noticeable when there were a lot of quiet channels and few louder channels. Fix this by increasing the weight of louder channels in relation to quieter channels. Each channel's target block energy is factored into the usual similarity measure. This should have little effect on very correlated channels (such as most stereo media), where the factors are very similar for all channels. See-Also: mpv-player#8705 See-Also: mpv-player#13737
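The effect of weighting by channel energy can be seen in a toy example. The sketch assumes the "usual similarity measure" is a per-channel normalized correlation summed over channels; that is an assumption for illustration, as are the function names, and this is not the actual af_scaletempo2 code.

```python
import math


def normalized_correlation(a, b):
    """Correlation of two sample blocks, scaled to the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def plain_similarity(target, candidate):
    """Every channel counts equally, however quiet it is."""
    return sum(normalized_correlation(t, c) for t, c in zip(target, candidate))


def energy_weighted_similarity(target, candidate):
    """Each channel's term is scaled by the energy of its target block."""
    return sum(
        sum(s * s for s in t) * normalized_correlation(t, c)
        for t, c in zip(target, candidate)
    )


loud, quiet = [1.0, -1.0], [0.1, 0.1]
target = [loud, quiet, quiet, quiet]
# Candidate A matches the loud channel but not the three quiet ones.
cand_a = [[1.0, -1.0]] + [[-0.1, -0.1]] * 3
# Candidate B matches the quiet channels but inverts the loud one.
cand_b = [[-1.0, 1.0]] + [[0.1, 0.1]] * 3

print(plain_similarity(target, cand_a) < plain_similarity(target, cand_b))
print(energy_weighted_similarity(target, cand_a)
      > energy_weighted_similarity(target, cand_b))
```

With equal weights the three quiet channels outvote the loud one and candidate B wins; with energy weighting the loud channel dominates and candidate A wins.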

mesvam · 2024-04-01T23:59:02Z

I'm not convinced scaletempo2 is actually better than the original scaletempo at any speed. The problem is that mpv's default parameters for scaletempo give suboptimal results, so when comparing each filter at default settings, scaletempo2 comes out ahead. But properly configured, scaletempo still beats scaletempo2. I have scaletempo=stride=15:overlap=1:search=15 and it gives nearly perfect playback quality from speeds 1 to 4, and I've never heard any artifacts on a variety of audio. CPU usage may be a bit higher with these settings, but at reasonable speeds on reasonably recent hardware, the load is negligible, especially compared to video decoding.

Meanwhile, for scaletempo2, no combination of parameters can guarantee artifact-free audio at any speed. And the artifacts can actually be quite severe. scaletempo2 has audible pitch shifting of as much as a semitone on drone notes in the background music, which sounds like wrong notes being played, which is really distracting. The subjective quality improvement at higher speeds is simply due to the artifacts being harder to hear since they go by so quickly, but they're still there.

For speeds < 1, scaletempo2 sounds similar to scaletempo, but WSOLA-type algorithms are all a bit of a crapshoot. FFT methods are better for that IMO.

christoph-heinrich · 2024-04-02T02:11:22Z

@mesvam here is a little excerpt from a song with your scaletempo parameters
1.12x speed.webm
1x speed.webm

mesvam · 2024-04-02T03:31:15Z

@christoph-heinrich ok I stand corrected. That timbre was worse than I expected.

I will say though, that even in that worst case, it's still better than when scaletempo2 goes wonky. Here is an example of background music going crazy with scaletempo2
excerpt.webm
excerpt-scaletempo2-1.06.webm

What's worse is that the artifacts in your excerpt are mainly due to the bass frequencies, which can be fixed by increasing stride/search to 30 or higher (scaletempo=stride=30:overlap=1:search=30), with some sacrifices when it comes to other content. I could not find any settings for scaletempo2 that would make my audio listenable, and there aren't even heavy bass frequencies in there!

Dudemanguy · 2024-04-12T17:41:12Z

Well #13748 improved this but I don't think it's necessarily fixed judging by the comments so reopening.

fideliochan · 2024-06-29T16:26:24Z

Is there any way to fix the desync of atempo? Because it's still the best one imo.

richardpl · 2024-06-29T21:08:34Z

not really, atempo filter changes timestamps and that causes desync, workaround is adding some hack which would rescale those timestamps back to original values that mpv expects.

fideliochan · 2024-06-29T21:17:48Z

What do you mean by hacks? Like this?
-vf setpts='PTS/1.15' -af atempo=1.15

richardpl · 2024-06-29T21:23:01Z

Yes, something like that hardcoded to keep A/V sync but that breaks seeking to right spot...
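The arithmetic behind that workaround, as a small sketch: atempo=1.15 compresses the audio timeline by a factor of 1.15, and setpts='PTS/1.15' applies the same scaling to the video timestamps. The function names are made up for illustration, and the connection to seeking is only a guess at why a hardcoded factor is awkward.

```python
def to_output_time(source_time_s: float, tempo: float) -> float:
    """Timestamp after the tempo change (what PTS/tempo computes)."""
    return source_time_s / tempo


def to_source_time(output_time_s: float, tempo: float) -> float:
    """Inverse mapping: where an output timestamp lies in the source."""
    return output_time_s * tempo


tempo = 1.15
print(to_output_time(115.0, tempo))  # close to 100 seconds of output
print(to_source_time(100.0, tempo))  # close to 115 seconds of source
```

A position expressed in one timeline has to be converted before it is used in the other, which is presumably where seeking goes wrong when the factor is baked into the filter chain.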

richardpl · 2024-07-10T19:46:18Z

I have developed a prototype filter that can stretch audio by a 2x factor, using autocorrelation by RDFT to find similar periods, plus interpolating the found periods with an equal-power cross-fade that makes use of the normalized cross-correlation factor between two periods. The output is much better than scaletempo(2) or atempo. I need to do something similar for the 1/2 factor for a 2x speed gain.
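For readers unfamiliar with the term, a generic equal-power cross-fade looks like the sketch below. It is the textbook technique only, not the prototype filter's code (which is not shown in this thread), and it leaves out the cross-correlation factor mentioned above.

```python
import math


def equal_power_gains(n):
    """Return n (fade_out, fade_in) gain pairs with fade_out^2 + fade_in^2 == 1."""
    gains = []
    for i in range(n):
        # Sweep the angle from 0 to pi/2, sampling mid-interval.
        theta = (i + 0.5) / n * (math.pi / 2)
        gains.append((math.cos(theta), math.sin(theta)))
    return gains


def crossfade(outgoing, incoming):
    """Blend two equally long sample blocks with an equal-power curve."""
    if len(outgoing) != len(incoming):
        raise ValueError("blocks must have the same length")
    return [
        o * g_out + i * g_in
        for (g_out, g_in), o, i in zip(
            equal_power_gains(len(outgoing)), outgoing, incoming
        )
    ]


print(crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]))
```

Because the squared gains always sum to 1, the combined power of two uncorrelated signals stays constant across the fade, unlike a linear cross-fade, which dips in the middle.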

richardpl · 2024-07-20T15:02:01Z

Got 0.5 and 2.0 ratios working well and fast. Maybe will add support for arbitrary ratios. If anybody interested to take a look at it I can push filter into librempeg.

richardpl · 2024-07-21T23:02:45Z

Looks like nobody interested in high-quality, removed code, and moved on to do other stuff.

richardpl · 2024-10-04T19:30:29Z

Anybody interested in 0.5 and 2.0 fixed tempo filter try ascale filter.

BergmannAtmet · 2024-10-04T20:22:20Z

Anybody interested in 0.5 and 2.0 fixed tempo filter try ascale filter.

yes. especially for 2.0.

kasper93 · 2024-10-04T20:33:29Z

I don't know if Paul provides any builds, but you can use the ones from #14977.

richardpl · 2024-10-04T21:44:56Z

well this will work only if audio-only is used, otherwise runtime AV adjust within mpv is not possible, maybe one can hack some new filter into mpv, once i figure how to do arbitrary scaling, and fix not so small issues with >=2 channels and solo high volume bass-rich & treble-poor audio.

richardpl · 2024-10-05T12:39:22Z

Now all tempo values from 0.5 to 2.0 should be supported. It sounds fine to me; >=2 channels and heavy bass audio with < 1.0 tempo still remain to be fixed.

Akemi added core:audio filter meta:rfc labels Apr 29, 2021

kevinlekiller mentioned this issue Jan 23, 2022

Autospeed causes audio to be distorted kevinlekiller/mpv_scripts#19

Closed

alexmercerind mentioned this issue Dec 20, 2022

Low quality bass harmonoid/harmonoid#368

Open

This was referenced Jul 24, 2023

Emptying audio filters seems to result in activation of default scaletempo2 #12006

Closed

There seems to be no way to use scaletempo with speed parameter (and scaletempo2 lacks similar one) #12007

Open

Dudemanguy mentioned this issue Sep 25, 2023

audio: allow specifying pitch correction filter #12479

Closed

ferreum mentioned this issue Mar 21, 2024

af_scaletempo2: improve signal similarity metric #13737

Closed

ferreum mentioned this issue Mar 22, 2024

af_scaletempo2: prioritize louder channels for similarity measure #13748

Merged

Dudemanguy closed this as completed in #13748 Apr 12, 2024

Dudemanguy reopened this Apr 12, 2024
