MPS device limitation error when running TTS on Apple Silicon Mac #228

kiwamizamurai · 2025-01-13T12:07:41Z

Environment

OS: macOS 14.7
Hardware: Apple M3 Pro
Python: 3.11.11
MeloTTS: 0.1.2

Dependencies

dependencies = [
    "transformers==4.27.4",
    "torch>=2.0.0",
    "sentencepiece>=0.1.99",
    "click>=8.1.7",
    "rich>=13.7.0",
    "pydantic>=2.6.0",
    "melotts @ git+https://github.com/myshell-ai/MeloTTS.git",
    "unidic>=1.1.0",
    "sounddevice>=0.5.1",
    "nltk>=3.8.1",
]

Issue Description

When trying to run TTS on an Apple Silicon Mac using the MPS (Metal Performance Shaders) device, the following error occurs:

Error: Output channels > 65536 not supported at the MPS device. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

The error is raised during speech synthesis, when the tts_to_file() method is called.

Steps to Reproduce

1. Initialize TTS with device='auto' or device='mps'
2. Call tts_to_file() with any English text
3. The error occurs during model inference

Current Workaround

Currently, we have two workarounds:

1. Set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1
2. Force CPU usage by initializing TTS with device='cpu'

Both workarounds result in slower performance compared to potential MPS acceleration.
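
For reference, the first workaround might look like the sketch below when applied from Python rather than from the shell. It assumes PyTorch reads PYTORCH_ENABLE_MPS_FALLBACK when it is first loaded, so the variable is set before anything imports torch. pick_device is a hypothetical helper for illustration, not part of MeloTTS.

```python
import os

# Set this before torch (or melo, which imports torch) is imported;
# setting it afterwards may have no effect. setdefault keeps any value
# that was already exported in the shell.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def pick_device() -> str:
    """Return 'mps' when PyTorch reports it as available, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed: nothing to accelerate, report CPU.
        return "cpu"
    mps = getattr(torch.backends, "mps", None)
    return "mps" if mps is not None and mps.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")

# With MeloTTS installed, the engine would then be created as elsewhere
# in this issue:
# from melo.api import TTS
# engine = TTS(language='EN', device=device)
```

Even with the fallback enabled, the unsupported op still runs on the CPU, so this is expected to be slower than a fully native MPS run.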

Additional Context

This seems to be related to a limitation in PyTorch's MPS backend regarding the maximum number of output channels. It would be beneficial if the model architecture could be adjusted to work within MPS device limitations, or if there's a way to optimize the operations to stay under the 65536 channel limit.

Code Example

from melo.api import TTS

# This fails on MPS
engine = TTS(language='EN', device='auto')
audio = engine.tts_to_file("Test text", speaker_id=0, output_path=None)

# Current workaround
engine = TTS(language='EN', device='cpu')  # Force CPU usage
audio = engine.tts_to_file("Test text", speaker_id=0, output_path=None)
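
A possible middle ground between the two snippets above is to try MPS first and retry on the CPU only if inference fails. This is a minimal sketch, assuming the MPS limitation surfaces as a RuntimeError (or a subclass such as NotImplementedError). run_with_cpu_fallback and synthesize are hypothetical names, not part of MeloTTS.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_cpu_fallback(run: Callable[[str], T], preferred: str = "mps") -> T:
    """Call run(preferred); if it raises RuntimeError, retry once on 'cpu'."""
    try:
        return run(preferred)
    except RuntimeError as err:  # NotImplementedError is a subclass of RuntimeError
        if preferred == "cpu":
            raise  # already on CPU, nothing left to fall back to
        print(f"{preferred} failed ({err}); retrying on cpu")
        return run("cpu")


def synthesize(device: str):
    # Imported lazily so the helper above can be used without MeloTTS installed.
    from melo.api import TTS

    engine = TTS(language='EN', device=device)
    return engine.tts_to_file("Test text", speaker_id=0, output_path=None)


# Usage (requires MeloTTS):
# audio = run_with_cpu_fallback(synthesize)
```

The retry rebuilds the engine on the CPU, so a failed MPS attempt costs an extra model load. Catching RuntimeError broadly also means unrelated failures trigger one retry before they propagate.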


yukiarimo · 2025-01-21T21:00:13Z

Something is wrong with your model or installation. I installed everything the same way on Colab and macOS, and it works perfectly with the same model on MPS, CUDA, and CPU. Try reinstalling, or try different versions of pip or Python.

coldfire84 · 2025-01-22T13:34:14Z

I have the same issue:

M1 MacBook Pro, running Sequoia 15.2 (had the same issue w/ OS version 15.1).
Python 3.10.11, torch 2.5.1

This doesn't look like an isolated issue (associated PR).

@kiwamizamurai were you able to solve this?

kiwamizamurai · 2025-01-22T21:41:25Z

@coldfire84

# Current workaround
engine = TTS(language='EN', device='cpu')  # Force CPU usage
audio = engine.tts_to_file("Test text", speaker_id=0, output_path=None)

You can avoid the issue like this.
