Models & limits

Last updated: 31 May 2026

The short version. Both Lite and Pro include the Keyda AI cloud pool, so no API key is needed. Lite gets 200 requests per day; Pro gets 1,000 requests per day and 500,000 tokens per day, with up to 2,048 tokens per single response. We rotate the line-up to each provider's current models, so you don't have to pick one yourself.

1. What's included with Lite & Pro

Both Lite and Pro proxy your AI calls through our infrastructure using our keys, so you never have to manage your own API key. We support the most capable models from each of the leading AI labs and upgrade them as new versions launch. The main difference between the plans is the daily request quota: 200/day on Lite, 1,000/day on Pro. Section 4 lists every limit.

2. Current cloud models

These are the cloud models live today. The list is updated whenever providers ship a new version — Lite and Pro users get the upgrade automatically at no extra cost.

Provider	Models	Best for
OpenAI	GPT 5.5, GPT 5.4 Pro, GPT 5.4 Mini / Nano, GPT 4.1, GPT 4.1 Mini	Reasoning, multimodal, daily-driver writing
Anthropic	Claude Opus 4.7, Claude Sonnet 4.6, Claude Haiku 4.5	Long-context, code, balanced writing, fastest replies
Google	Gemini 3.1 Pro, Gemini 3.1 Flash Lite, Gemini 2.5 Pro / Flash / Flash Lite	Multimodal, price-performance, fastest inference
xAI	Grok 4, Grok 4 Fast (reasoning / non-reasoning), Grok Code Fast	Real-time knowledge, fast reasoning, agentic coding
Groq	Llama 3.3 70B, Llama 3.1 8B Instant, GPT-OSS 120B / 20B (via Groq)	Ultra-low-latency open-source inference

Model availability can change without notice if a provider deprecates a model or releases a successor. We keep the line-up state-of-the-art — Lite and Pro users do not need to track this.

3. On-device models (Lite & Pro)

These models run entirely on your phone — no network, no token cost, no rate limit. They are smaller than cloud models but excellent for short replies, grammar fixes, summaries and quick rephrases.

Gemma 3n (best quality) — Google's on-device flagship; recommended for most users.
Gemma 3 1B (fast) — smaller and faster; great for older devices.
Additional optimised open-source models will be added as the on-device ecosystem matures.

On-device image generation is also included on Lite and Pro for sticker creation — fully offline.

4. Daily limits

Limit	Lite	Pro	Notes
Cloud AI requests / day	200	1,000	Resets at 00:00 IST
Tokens per day	—	500,000	Sum of input + output tokens across all cloud calls
Tokens per single response	2,048	2,048	Per-request output ceiling
On-device usage	Unlimited	Unlimited	Runs entirely on your phone — no quota
BYOK usage (own key)	Unlimited	Unlimited	Bound only by your provider's own limits

Quotas reset daily at 00:00 IST. Once the daily request cap is reached, cloud AI calls are paused until the next reset at 00:00 IST; on-device models and BYOK remain usable without restriction. If you regularly hit your cap, upgrading to Pro gives you 5× the daily requests (1,000 instead of 200).
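
For readers who want to reason about the caps programmatically, the rules above can be sketched roughly as follows. This is an illustrative sketch, not Keyda's actual implementation: the names PLAN_LIMITS, Usage, cloud_call_allowed and next_reset are hypothetical, and treating the Pro token cap as also pausing cloud calls is an assumption, since this page only describes what happens at the request cap.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# IST is UTC+05:30 with no daylight saving, so a fixed offset is enough.
IST = timezone(timedelta(hours=5, minutes=30))

# Daily caps copied from the limits table; None means no cap is listed.
PLAN_LIMITS = {
    "lite": {"requests": 200, "tokens": None},
    "pro": {"requests": 1000, "tokens": 500_000},
}


@dataclass
class Usage:
    """Cloud usage counted since the last reset (hypothetical structure)."""
    requests: int = 0
    tokens: int = 0


def cloud_call_allowed(plan: str, usage: Usage) -> bool:
    """Return True while the plan's daily caps have not been reached."""
    limits = PLAN_LIMITS[plan]
    if usage.requests >= limits["requests"]:
        return False
    # Assumption: reaching the token cap pauses cloud calls as well.
    if limits["tokens"] is not None and usage.tokens >= limits["tokens"]:
        return False
    return True


def next_reset(now: datetime) -> datetime:
    """Return the next 00:00 IST strictly after the given aware datetime."""
    local = now.astimezone(IST)
    midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)
```

For example, cloud_call_allowed("lite", Usage(requests=200)) is False, while on-device and BYOK calls would never go through this check at all.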

5. How model selection works

By default Keyda picks a sensible model per provider for you (the current "daily driver" for that lab). You can override the default per provider from Settings → AI → Model inside the keyboard if you want a specific model. If a chosen model fails or is deprecated, Keyda falls back to the next-best model from the same provider automatically.
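
The override-then-fallback behaviour described above can be pictured with a small sketch. It rests on stated assumptions: pick_model and is_available are hypothetical names, and "next-best" is modelled as the next entry in a per-provider list ordered from most to least preferred, with the first entry standing in for the default "daily driver". How Keyda actually ranks models is not specified on this page.

```python
from typing import Callable, Optional, Sequence


def pick_model(
    ranked: Sequence[str],
    is_available: Callable[[str], bool],
    override: Optional[str] = None,
) -> Optional[str]:
    """Return the first usable model for a single provider.

    ranked       -- that provider's models, most preferred first (assumed order)
    is_available -- hypothetical check that a model is live and responding
    override     -- the user's explicit choice, if any
    """
    candidates = list(ranked)
    if override in candidates:
        # Try the user's choice first, then the rest in ranked order.
        candidates.remove(override)
        candidates.insert(0, override)
    for model in candidates:
        if is_available(model):
            return model
    # Nothing from this provider is usable right now.
    return None
```

With ranked = ["model-a", "model-b", "model-c"] and an override of "model-c" that has been deprecated, the function returns "model-a", which stays within the same provider as the paragraph describes.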

6. Privacy

Your prompts and responses are not used to train any model, ours or the providers'. We log only the request count, token count, model name and timestamp for billing and abuse-prevention — never the content of your messages. See the Privacy Policy for full detail.
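
As an illustration of what "metadata only" means, a log record limited to the four fields named above might look like the sketch below. The class and function names are hypothetical, and recording one entry per cloud call (so request_count is 1) is an assumption; the real schema is not published here.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class UsageLogEntry:
    """Metadata-only record; there is intentionally no field for message text."""
    request_count: int
    token_count: int
    model_name: str
    timestamp: datetime


def make_log_entry(
    model_name: str,
    token_count: int,
    now: Optional[datetime] = None,
) -> UsageLogEntry:
    """Build the record for one cloud call without touching prompt content."""
    # Assumption: one entry is written per cloud call.
    return UsageLogEntry(
        request_count=1,
        token_count=token_count,
        model_name=model_name,
        timestamp=now or datetime.now(timezone.utc),
    )
```

Because the function never receives the prompt or the response, there is no path by which message content could end up in the record.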

7. Changes

We may add, swap or retire models at any time as the AI landscape evolves. Material changes (such as a daily-cap reduction) will be announced in-app and on this page. The "Last updated" date at the top reflects the current version.

8. Contact

Questions about which model to use, or hit a limit? Email support@keyda.in or use the Contact form.