After loading my custom model - unsupportedTokenizer error

In October 2025, using mlx_lm.lora, I created an adapter and a fused model, which I uploaded to Hugging Face. I was able to incorporate this model into my SwiftUI app using the MLX package (MLX libraries 2.25.8). My base LLM was mlx-community/Mistral-7B-Instruct-v0.3-4bit.

Looking at LLMModelFactory.swift in the current version (2.29.1), the only changes are the addition of a few models.

The earlier model was called pharmpk/pk-mistral-7b-v0.3-4bit; the new model is called pharmpk/pk-mistral-2026-03-29.

The base model (mlx-community/Mistral-7B-Instruct-v0.3-4bit) must still be available. Could the 'unsupportedTokenizer' error be related to changes in the MLX package? I noticed mention of splitting the package into two parts but don't see anything about it on GitHub.

Feeling rather lost. Does anyone have any thoughts and/or suggestions?

Thanks, David

With the same code and MLX libraries 2.25.8 but the new model, I get the same error. I might need to revisit the new model.

Tokenizer breakage across mlx versions is a recurring pain point — the tokenizer factory gets updated without guaranteed backward compat for custom-fused models. Check if tokenizer_config.json in your fused model specifies a tokenizer_class that 2.29.1 still recognizes. Manually setting the tokenizer type in LLMModelFactory registration usually gets around it.
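To illustrate that last point, here is a minimal sketch of registering the fused model with an explicit tokenizer class. This assumes the `overrideTokenizer` parameter on `ModelConfiguration` and the `register(configurations:)` method that appear in recent mlx-swift-examples; check the signatures in your pinned version before relying on them. Separately, newer mlx-lm versions appear to write `"tokenizer_class": "TokenizersBackend"` into the fused model's tokenizer_config.json; if swift-transformers doesn't recognize that value, restoring the base model's original value (e.g. `LlamaTokenizerFast` for Mistral) may also resolve the error.

```swift
import MLXLMCommon  // assumption: the module defining ModelConfiguration in mlx-swift-examples

// Register the fused model with a tokenizer class swift-transformers recognizes,
// rather than whatever the fused tokenizer_config.json declares.
let fusedConfig = ModelConfiguration(
    id: "pharmpk/pk-mistral-2026-03-29",
    overrideTokenizer: "PreTrainedTokenizer"  // assumption: verify against your swift-transformers version
)
LLMRegistry.shared.register(configurations: [fusedConfig])

// Then load as usual:
let container = try await LLMModelFactory.shared.loadContainer(
    configuration: fusedConfig
) { progress in
    print("Downloading: \(progress.fractionCompleted * 100)%")
}
```

Either approach (override at registration, or fix the JSON on the Hub) should be enough to tell whether the tokenizer class is actually the culprit.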

Thank you for your response.

In tokenizer_config.json I have the line "tokenizer_class": "TokenizersBackend",

In my Swift code I have

    let modelFactory = LLMModelFactory.shared

    let modelConfiguration = LLMRegistry.shared.configuration(id: "pharmpk/pk-mistral-7b-v0.3-4bit")

    // Load the model off the main actor, then assign on the main actor
    let loaded = try await modelFactory.loadContainer(configuration: modelConfiguration) { progress in
        print("Downloading progress: \(progress.fractionCompleted * 100)%")
    }
    await MainActor.run {
        self.model = loaded
    }

It appears the difference between code that works and code that doesn't is the format of the train/valid.json files I'm providing to mlx_lm.lora.

Text-type input seems to work:

{"text": "This is an example for the model."}

I thought I might get a better model using the chat format:

{"messages":[{"role":"user","content":"What is pharmacokinetics?"},{"role":"assistant","content":"Pharmacokinetics is the study of the time course of drug absorption, distribution, metabolism, and excretion (ADME). It involves the mathematical analysis of these processes to describe and predict drug concentrations in the body over time. The term comes from the Greek words 'pharmakon' (drug) and 'kinesis' (movement), literally meaning the movement of drugs through the body."}]}

This is when I get the error, after loading the new fused model.
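Before re-fusing, a quick way to rule out malformed training data is to check that every line parses as a single JSON object with either a "text" string or a "messages" array of role/content pairs. This is a hypothetical helper using only Foundation (no MLX dependency); the exact schemas mlx_lm.lora accepts are described in the mlx-lm LoRA documentation.

```swift
import Foundation

// Returns true if a single JSONL line matches either the "text" schema
// or the "messages" (chat) schema.
func isValidTrainingLine(_ line: String) -> Bool {
    guard let data = line.data(using: .utf8),
          let obj = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
    else { return false }
    if obj["text"] is String { return true }
    if let messages = obj["messages"] as? [[String: Any]] {
        return !messages.isEmpty && messages.allSatisfy {
            $0["role"] is String && $0["content"] is String
        }
    }
    return false
}
```

If every line passes, the data format itself is probably fine and the problem is more likely in what the fuse step wrote into the tokenizer files.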

I don't know where to add the manual setting of the tokenizer type. In my Swift code?

Thanks
