Releases: ml-explore/mlx
v0.1.0
Highlights
- Memory use improvements:
  - Gradient checkpointing for training with `mx.checkpoint`
  - Better graph execution order
  - Buffer donation
Core
- Gradient checkpointing with `mx.checkpoint`
- CPU-only QR factorization: `mx.linalg.qr`
- Release the Python GIL during `mx.eval`
- Depth-based graph execution order
- Lazy loading of arrays from files
- Buffer donation for reduced memory use
- `mx.diag`, `mx.diagonal`
- Breaking: `array.shape` is a Python tuple
- GPU support for `int64` and `uint64` reductions
- vmap over reductions and arg reductions: `sum`, `prod`, `max`, `min`, `all`, `any`, `argmax`, `argmin`
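For illustration, the factorization that `mx.linalg.qr` computes can be sketched with NumPy's equivalent (NumPy is used here only so the snippet runs anywhere; the MLX call takes an `mx.array` instead):

```python
import numpy as np

# QR factorization: A = Q @ R, with Q orthonormal and R upper-triangular.
# mx.linalg.qr provides this decomposition on the CPU; np.linalg.qr
# below illustrates the same semantics.
A = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Q, R = np.linalg.qr(A)

# Q has orthonormal columns, and Q @ R reconstructs A.
assert np.allclose(Q.T @ Q, np.eye(2))
assert np.allclose(Q @ R, A)
```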
NN
- Softshrink activation
Bugfixes
- Comparisons with `inf` work, and fix `mx.isinf`
- Bug fix in the RoPE cache
- Handle empty matmul on the CPU
- Negative shape checking for `mx.full`
- Correctly propagate `NaN` in some binary ops: `mx.logaddexp`, `mx.maximum`, `mx.minimum`
- Fix >4D non-contiguous binary ops
- Fix `mx.log1p` with `inf` input
- Fix SGD to apply weight decay even with 0 momentum
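The NaN-propagation fix brings these ops in line with the usual IEEE-754/NumPy convention that a NaN operand yields a NaN result; a NumPy sketch of the expected behavior:

```python
import numpy as np

# IEEE-754 / NumPy semantics: if either operand is NaN, the result is NaN.
# The fix makes mx.logaddexp, mx.maximum, and mx.minimum behave this way.
a = np.float32(np.nan)
b = np.float32(1.0)

assert np.isnan(np.logaddexp(a, b))
assert np.isnan(np.maximum(a, b))
assert np.isnan(np.minimum(a, b))
```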
v0.0.11
Highlights:
- GGUF improvements:
  - Native quantizations `Q4_0`, `Q4_1`, and `Q8_0`
  - Metadata
Core
- Support for reading and writing GGUF metadata
- Native GGUF quantization (`Q4_0`, `Q4_1`, and `Q8_0`)
- Quantize with group size of 32 (2x32, 4x32, and 8x32)
NN
- `Module.save_weights` supports safetensors
- `nn.init` package with several commonly used neural network initializers
- Binary cross entropy and cross entropy losses can take probabilities as targets
- `Adafactor` in `nn.optimizers`
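Cross entropy with probability ("soft") targets generalizes the usual hard-label case; a minimal NumPy sketch of the math (the MLX loss's exact signature and reduction options may differ):

```python
import numpy as np

# Cross entropy with probability targets: the target is a distribution
# over classes rather than a single class index.
def cross_entropy(logits, target_probs):
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -(target_probs * log_probs).sum(axis=-1)

logits = np.array([[2.0, 1.0, 0.0]])
hard = np.array([[1.0, 0.0, 0.0]])   # one-hot target: the usual CE
soft = np.array([[0.7, 0.2, 0.1]])   # probability target

assert cross_entropy(logits, hard)[0] > 0
assert cross_entropy(logits, soft)[0] > cross_entropy(logits, hard)[0]
```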
Bugfixes
- Fix `isinf` and friends for integer types
- Fix array creation from lists of Python ints to `int64`, `uint`, and `float32`
- Fix power VJP for `0` inputs
- Fix out-of-bounds `inf` reads in `gemv`
- Fix `mx.arange` crashes on NaN inputs
v0.0.10
Highlights:
- Faster matmul: up to 2.5x faster for certain sizes (see benchmarks)
- Fused matmul + addition (for faster linear layers)
Core
- Quantization supports sizes other than multiples of 32
- Faster GEMM (matmul)
- `addmm` primitive (fused addition and matmul)
- `mx.isnan`, `mx.isinf`, `isposinf`, `isneginf`
- `mx.tile`
- VJPs for `scatter_min` and `scatter_max`
- Multi-output split primitive
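The fused addition-and-matmul primitive is functionally equivalent to an unfused bias-plus-matmul, just computed in one kernel; a NumPy sketch of what it computes:

```python
import numpy as np

# Fused addition + matmul computes bias + A @ B in a single kernel
# (the win behind faster linear layers). Functionally it matches the
# unfused NumPy expression below.
A = np.ones((2, 3))
B = np.ones((3, 4))
bias = np.full((4,), 0.5)

out = bias + A @ B  # what the fused primitive computes

assert out.shape == (2, 4)
assert np.allclose(out, 3.5)  # each A @ B entry is 3.0, plus 0.5 bias
```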
NN
- Losses: Gaussian negative log-likelihood
Misc
- Performance enhancements for graph evaluation with many outputs
- Primitive VJP takes the output as input, reducing redundant work without the need for simplification
- Default PRNG seed is based on the current time instead of being fixed to 0
- Booleans print in Python style when used from Python
Bugfixes
- Fix scatter for < 32-bit precision and an integer overflow
- Fix overflow in `mx.eye`
- Report Metal out-of-memory issues instead of failing silently
- Change `mx.round` to follow NumPy, which rounds half to even
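Round-half-to-even ("banker's rounding") sends ties to the nearest even integer rather than always rounding up; since the change makes `mx.round` follow NumPy, the NumPy behavior shows what to expect:

```python
import numpy as np

# Round-half-to-even: ties go to the nearest even integer,
# so 0.5 -> 0 and 1.5 -> 2. mx.round now matches this.
vals = np.array([0.5, 1.5, 2.5, 3.5])
rounded = np.round(vals)

assert rounded.tolist() == [0.0, 2.0, 2.0, 4.0]
```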
v0.0.9
Highlights:
- Initial (and experimental) GGUF support
- Support for the Python buffer protocol (easy interoperability with NumPy, JAX, TensorFlow, PyTorch, etc.)
- `at[]` syntax for scatter-style operations: `x.at[idx].add(y)` (also `min`, `max`, `prod`, etc.)
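The point of `x.at[idx].add(y)` is that repeated indices accumulate, unlike plain `x[idx] += y`. NumPy's `np.add.at` has the same accumulate-on-duplicates semantics and makes a convenient illustration (assuming MLX follows the JAX-style behavior this syntax comes from):

```python
import numpy as np

# Scatter-add: repeated indices accumulate. Plain fancy-index assignment
# (x[idx] += 1.0) would only apply the update once per unique index.
x = np.zeros(4)
idx = np.array([0, 0, 2])
np.add.at(x, idx, 1.0)  # analogue of x.at[idx].add(1.0)

assert x.tolist() == [2.0, 0.0, 1.0, 0.0]
```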
Core
- Array creation from other `mx.array`s (`mx.array([x, y])`)
- Complete support for the Python buffer protocol
- `mx.inner`, `mx.outer`
- `mx.logical_and`, `mx.logical_or`, and operator overloads
- Array `at` syntax for scatter ops
- Better support for in-place operations (`+=`, `*=`, `-=`, ...)
- VJP for scatter and scatter add
- Constants (`mx.pi`, `mx.inf`, `mx.newaxis`, ...)
NN
- GLU activation
- `cosine_similarity` loss
- Cache for `RoPE` and `ALiBi`
Bugfixes / Misc
- Fix data type with `tri`
- Fix saving non-contiguous arrays
- Fix graph retention for in-place state, and remove `retain_graph`
- Multi-output primitives
- Better support for loading devices
v0.0.7
Core
- Support for loading and saving Hugging Face's safetensors format
- Transposed quantization matmul kernels
- `mlx.core.linalg` sub-package with `mx.linalg.norm` (Frobenius, infinity, p-norms)
- `tensordot` and `repeat`
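The norms listed are the standard ones from `numpy.linalg.norm`, which makes a handy reference for what `mx.linalg.norm` computes (a sketch of the semantics, not MLX's exact API surface):

```python
import numpy as np

# The standard norms: Frobenius for matrices by default, p-norms and the
# infinity norm for vectors via the `ord` argument.
M = np.array([[3.0, 0.0], [0.0, 4.0]])
v = np.array([3.0, 4.0])

fro = np.linalg.norm(M)               # Frobenius: sqrt(9 + 16) = 5
l1 = np.linalg.norm(v, ord=1)         # p = 1: |3| + |4| = 7
linf = np.linalg.norm(v, ord=np.inf)  # infinity norm: max(|3|, |4|) = 4

assert np.isclose(fro, 5.0)
assert l1 == 7.0 and linf == 4.0
```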
NN
- Layers: `Bilinear`, `Identity`, `InstanceNorm`, `Dropout2D`, `Dropout3D`
- More customizable `Transformer` (pre/post norm, dropout)
- More activations: `SoftSign`, `Softmax`, `HardSwish`, `LogSoftmax`
- Configurable scale in `RoPE` positional encodings
- Losses: `hinge`, `huber`, `log_cosh`
Misc
- Faster GPU reductions for certain cases
- Change to memory allocation to allow swapping
v0.0.6
Core
- `quantize`, `dequantize`, `quantized_matmul`
- `moveaxis`, `swapaxes`, `flatten`
- `stack`
- `floor`, `ceil`, `clip`
- `tril`, `triu`, `tri`
- `linspace`
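A rough sketch of the idea behind `quantize`/`dequantize`: group-wise affine quantization, where each group of weights gets its own scale and offset. This is a simplified illustration, not MLX's exact scheme or memory layout:

```python
import numpy as np

# Group-wise affine quantization (simplified sketch, not MLX's exact
# scheme): each group of `group_size` weights is mapped to `bits`-bit
# integers with a per-group scale and offset.
def quantize(w, group_size=32, bits=4):
    levels = 2**bits - 1
    g = w.reshape(-1, group_size)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = (hi - lo) / levels
    scale[scale == 0] = 1.0  # avoid division by zero for constant groups
    q = np.round((g - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    return q * scale + lo

w = np.linspace(-1.0, 1.0, 64).astype(np.float32)
q, scale, lo = quantize(w)
w_hat = dequantize(q, scale, lo).reshape(-1)

# 4-bit groups reconstruct the weights to within one quantization step.
assert np.max(np.abs(w - w_hat)) <= scale.max()
```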
Optimizers
- `RMSProp`, `Adamax`, `Adadelta`, `Lion`
NN
- Layers: `QuantizedLinear`, `ALiBi` positional encodings
- Losses: label smoothing, Smooth L1 loss, triplet loss
Misc
- Bug fixes