Class Tokenizer
java.lang.Object
com.codename1.ai.Tokenizer
Rough, best-effort token counting. Useful for answering the common question "am I likely to exceed this model's context window?" without shipping the full BPE table (cl100k_base is ~1.7 MB, which is substantial for a mobile binary).
The rule of thumb is 1 token ~= 4 characters of English text, which holds within ~10-15% for typical chat traffic. For non-Latin scripts the ratio is closer to one token per character, so the estimate is clamped from below at the rough number of words. Apps that need exact accounting should read the usage value from the API response and adjust their budget accordingly.
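The heuristic described above can be sketched in plain Java as follows. This is a minimal illustration, not the library's actual source: the class name TokenEstimateSketch, the constant CHARS_PER_TOKEN, and the whitespace-based word count are all assumptions made for the example.

```java
final class TokenEstimateSketch {
    // Assumed ratio taken from the rule of thumb: about 4 characters per token.
    static final int CHARS_PER_TOKEN = 4;

    /** Hypothetical re-creation of the heuristic; the real Tokenizer may differ. */
    static int estimate(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        // Character-based estimate, rounded up.
        int byChars = (text.length() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN;
        // Rough word count: runs of non-whitespace separated by whitespace.
        String trimmed = text.trim();
        int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        // Lower bound: never report fewer tokens than words.
        return Math.max(byChars, words);
    }

    public static void main(String[] args) {
        System.out.println(estimate("Hello, how are you today?")); // 25 chars, 5 words -> 7
        System.out.println(estimate("a b c d e f"));               // 11 chars, 6 words -> 6
    }
}
```

In the second call the word-count floor wins over the character-based figure, which is the case the clamp is meant to cover. Given the stated 10-15% error, a caller comparing the result against a context window would probably want to leave a safety margin of at least that size.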
Method Summary
Modifier and Type   Method                                        Description
static int          estimate                                      Approximate token count for text.
static int          estimateMessages(List<ChatMessage> messages)  Estimate the prompt-tokens cost of an entire conversation.
Method Details
estimate
Approximate token count for text.
estimateMessages
Estimate the prompt-tokens cost of an entire conversation. Adds a small fixed overhead per message to approximate the role / formatting tokens the provider includes.
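A conversation-level estimate along these lines might look like the sketch below. It is illustrative only: the Msg class is a hypothetical stand-in for ChatMessage (whose accessors are not shown in this section), and the overhead of 4 tokens per message is an assumed value, not one documented here.

```java
import java.util.Arrays;
import java.util.List;

final class ConversationEstimateSketch {
    // ASSUMPTION: fixed overhead per message for role / formatting tokens.
    static final int PER_MESSAGE_OVERHEAD = 4;

    /** Hypothetical stand-in for ChatMessage. */
    static final class Msg {
        final String role;
        final String content;

        Msg(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    /** Simplified per-text estimate: about 4 characters per token, rounded up. */
    static int estimateText(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        return (text.length() + 3) / 4;
    }

    /** Sum of per-message content estimates plus the assumed fixed overhead. */
    static int estimateMessages(List<Msg> messages) {
        int total = 0;
        for (Msg m : messages) {
            total += PER_MESSAGE_OVERHEAD + estimateText(m.content);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Msg> chat = Arrays.asList(
                new Msg("system", "You are a helpful assistant."), // 28 chars -> 7 tokens
                new Msg("user", "Hi"));                            // 2 chars  -> 1 token
        System.out.println(estimateMessages(chat)); // (4 + 7) + (4 + 1) = 16
    }
}
```

Because the overhead is charged per message, a conversation with many short turns costs noticeably more than its raw text length suggests.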