QwenLM Behavior
This page covers the toggles and options that control how IntenseRP interacts with QwenLM (chat.qwen.ai).
Modes (model IDs)
In IntenseRP Next v2, the model you select in SillyTavern is mostly a mode selector, not a true model picker.
For QwenLM, these model IDs map to simple behavior presets:
| Model ID | Behavior |
|---|---|
| qwen-auto | Uses your IntenseRP settings |
| qwen-chat | Forces Thinking off and never emits <think> |
| qwen-reasoner | Forces Thinking on (Send Thinking follows your setting) |
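The presets in the table can be read as a small override function. The following is only an illustrative sketch, not IntenseRP's actual code: the names `ThinkingSettings` and `resolve_mode` are hypothetical, and treating unknown model IDs like qwen-auto is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ThinkingSettings:
    """Hypothetical container for the two thinking-related toggles."""
    enable_thinking: bool = False
    send_thinking: bool = False


def resolve_mode(model_id: str, settings: ThinkingSettings) -> ThinkingSettings:
    """Return the effective thinking settings for one request."""
    if model_id == "qwen-chat":
        # Thinking is forced off and <think> is never emitted.
        return ThinkingSettings(enable_thinking=False, send_thinking=False)
    if model_id == "qwen-reasoner":
        # Thinking is forced on; Send Thinking keeps the user's setting.
        return ThinkingSettings(enable_thinking=True,
                                send_thinking=settings.send_thinking)
    # qwen-auto: use the saved settings unchanged.
    # Assumption: any unrecognized ID is treated the same way.
    return ThinkingSettings(settings.enable_thinking, settings.send_thinking)
```

For example, `resolve_mode("qwen-reasoner", ThinkingSettings())` turns Thinking on for that request while leaving Send Thinking off.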
Real Qwen model selection (web UI)
IntenseRP can also switch QwenLM's real model picker in the web UI:
Settings -> Provider Behavior -> QwenLM -> Model
The list intentionally mirrors what Qwen shows in its dropdown. If your selected model is missing (staged UI rollout, region, or a UI change), IntenseRP logs a warning and keeps going.
Thinking
QwenLM exposes three thinking modes in the web UI: Auto, Thinking, and Fast.
IntenseRP keeps it simple:
- Enable Thinking on -> selects Thinking
- Enable Thinking off -> selects Fast
Enable Thinking
Settings -> Provider Behavior -> QwenLM -> Enable Thinking
Send Thinking
When enabled, IntenseRP includes QwenLM's thinking summaries in the response, wrapped in <think> tags.
Settings -> Provider Behavior -> QwenLM -> Send Thinking
Thinking content
QwenLM sends a short thinking summary stream. This is what IntenseRP forwards (not hidden internal chain-of-thought).
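On the client side, a reply that includes thinking can be separated back into summary and answer. This is a minimal sketch under the assumption that the summary arrives as one or more <think>...</think> blocks somewhere in the text; the helper name `split_thinking` is hypothetical, and the exact whitespace IntenseRP places around the tags is not specified here.

```python
import re

# Non-greedy match so several <think> blocks are handled separately.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Split a reply into (thinking summary, visible answer)."""
    thoughts = [block.strip() for block in _THINK_RE.findall(text)]
    answer = _THINK_RE.sub("", text).strip()
    return "\n".join(thoughts), answer


thinking, answer = split_thinking("<think>Plan the scene.</think>\nThe door creaks open.")
print(thinking)
print(answer)
```

With Send Thinking off (or with qwen-chat), there are no tags to find, so the first element is simply an empty string.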
Search
QwenLM Web search can be toggled via IntenseRP.
Qwen streams tool/search payloads into the same response stream. For stability, IntenseRP strips these tool payloads, so search results are not sent to the client.
Settings -> Provider Behavior -> QwenLM -> Enable Search
Tools
QwenLM's Tools switch lives in the same + menu as Search and Upload.
IntenseRP keeps Tools off by default. If Enable Tools is turned on, the driver will try to flip Qwen's Tools switch on before sending. If Qwen shows the switch as disabled, IntenseRP stops trying for that request and leaves it off.
Settings -> Provider Behavior -> QwenLM -> Enable Tools
Experimental
Leave Enable Tools disabled unless you are intentionally testing it. Qwen's tool payloads are not a polished IntenseRP workflow yet, and anything tool/search-shaped in the stream is still treated defensively.
Separate from Search
Enable Tools is its own toggle. It doesn't replace Enable Search, and enabling one does not automatically enable the other.
Count Tokens
QwenLM reports token usage during the response stream. When enabled, IntenseRP captures those values and returns them in the OpenAI-style usage fields (prompt_tokens, completion_tokens, total_tokens). This is enabled by default.
Settings -> Provider Behavior -> QwenLM -> Count Tokens
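A client can read these fields like any other OpenAI-style response. The sketch below assumes the response has already been parsed into a dictionary; `describe_usage` is a hypothetical helper and the numbers in `sample` are made up for illustration.

```python
def describe_usage(response: dict) -> str:
    """Summarize the OpenAI-style usage block, if one is present."""
    usage = response.get("usage")
    if not usage:
        # Count Tokens is off, or the provider did not report usage.
        return "usage not reported"
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", prompt + completion)
    return f"prompt={prompt} completion={completion} total={total}"


# Illustrative payload only; real values come from QwenLM's stream.
sample = {
    "usage": {
        "prompt_tokens": 1200,
        "completion_tokens": 300,
        "total_tokens": 1500,
    }
}
print(describe_usage(sample))
```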
File Upload Mode
Instead of typing your message into QwenLM's chat box, IntenseRP can upload it as a text file attachment. This is useful for very long prompts that might hit input limits.
Settings -> Provider Behavior -> QwenLM -> Send As Text File
QwenLM file uploads are flaky
QwenLM does not handle files very reliably right now. If you are sending important context (system prompts, lore, character sheets), you will usually get better results by keeping File Upload Mode off and sending plain text instead.
Text File Message
Optional text that IntenseRP pastes into QwenLM alongside the uploaded file.
Leave it empty to send a file-only message.
Settings -> Provider Behavior -> QwenLM -> Text File Message
Helps a lot
QwenLM treats files as attachments rather than part of the message content, so it can be helpful to include a short message like "Context attached, please read it before answering" to make sure the model knows to look at the file.
File Upload Timeout
After uploading, the send button can take a moment to become available. This setting controls how long IntenseRP waits (in seconds) before giving up.
Settings -> Provider Behavior -> QwenLM -> File Upload Timeout
Message Send Timeout
QwenLM doesn't render the send button until there's something to send (typed text or an uploaded file). This timeout controls how long IntenseRP waits for the send button to appear in normal (non-file) mode.
Settings -> Provider Behavior -> QwenLM -> Message Send Timeout (s)
Qwen Quirks & Timing
QwenLM's web UI can sometimes show that it's sending before the actual completion request appears on the network. When the request takes too long to appear, IntenseRP may conclude that the click was swallowed even though Qwen is still waking up.
Completion Request Timeout
How long (in seconds) IntenseRP waits after clicking Send or Regenerate for QwenLM's completion request to appear.
Settings -> Provider Behavior -> QwenLM -> Quirks -> Completion Request Timeout (s)
| Property | Value |
|---|---|
| Default | 150 seconds |
| Minimum | 5 seconds |
If you're seeing "QwenLM: completion request not observed", try raising this. This is different from the stream timeout: it only covers the gap before Qwen's backend request starts.
First Chunk Timeout
How long (in seconds) IntenseRP waits for QwenLM's response stream to produce its first chunk after the completion request has started.
Settings -> Provider Behavior -> QwenLM -> Quirks -> First Chunk Timeout (s)
| Property | Value |
|---|---|
| Default | 150 seconds |
| Minimum | 5 seconds |
If you're seeing "timed out waiting for intercepted first chunk" errors on QwenLM, this is the setting to raise.
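The two timeouts cover back-to-back phases, which is why they are tuned separately. The sketch below illustrates that ordering only. It assumes two hypothetical `threading.Event` objects that some network interceptor would set, and it assumes the documented minimum is enforced by clamping; none of these names come from IntenseRP itself.

```python
import threading

DEFAULT_TIMEOUT_S = 150.0  # documented default for both settings
MIN_TIMEOUT_S = 5.0        # documented minimum for both settings


def clamp_timeout(value: float) -> float:
    """Assumption: values below the minimum are raised to it."""
    return max(MIN_TIMEOUT_S, float(value))


def wait_for_stream_start(
    request_seen: threading.Event,
    first_chunk_seen: threading.Event,
    request_timeout: float = DEFAULT_TIMEOUT_S,
    first_chunk_timeout: float = DEFAULT_TIMEOUT_S,
) -> None:
    """Wait for the completion request, then for the first stream chunk."""
    # Phase 1: covered by Completion Request Timeout.
    if not request_seen.wait(timeout=request_timeout):
        raise TimeoutError("QwenLM: completion request not observed")
    # Phase 2: covered by First Chunk Timeout, starts only after phase 1.
    if not first_chunk_seen.wait(timeout=first_chunk_timeout):
        raise TimeoutError("timed out waiting for intercepted first chunk")
```

Because phase 2 starts only once phase 1 succeeds, raising First Chunk Timeout does nothing for a request that never appears, and vice versa.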
Provider guardrails (recommended)
QwenLM has a few web settings that can silently change how your prompt gets sent (for example, turning big messages into file attachments).
To keep things predictable, IntenseRP tries to auto-disable these when the driver starts:
- Large Text as File
- Split Large Chunks
- Memory
- History Memory
If any of those get changed, you might see the Qwen tab reload once. That is normal.
Reuse Matching Chat
Reuse Matching Chat tries to keep chats tidy: when you send the exact same prompt twice in a row, IntenseRP clicks Qwen's "Regenerate" instead of creating a brand new chat. This is especially useful if you swipe a lot in SillyTavern and want to hit rate limits less often.
Settings -> Provider Behavior -> QwenLM -> Reuse Matching Chat
Search Older Matching Chats
When enabled, IntenseRP keeps up to 7 older cached Qwen chats per account, so it can reopen a matching older conversation and regenerate there instead of only checking the most recent one.
Settings -> Provider Behavior -> QwenLM -> Search Older Matching Chats
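One way to picture chat reuse is a small bounded cache keyed by the prompt. This is a conceptual sketch only: `MatchingChatCache` is a hypothetical name, hashing the prompt with SHA-256 and evicting the oldest entry first are assumptions, and the default of 8 entries stands for the most recent chat plus 7 older ones. How IntenseRP actually matches and evicts chats is not described here.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class MatchingChatCache:
    """Bounded map from exact prompt text to a Qwen chat ID (sketch)."""

    def __init__(self, max_chats: int = 8) -> None:
        self.max_chats = max_chats
        self._chats: "OrderedDict[str, str]" = OrderedDict()

    @staticmethod
    def _key(prompt: str) -> str:
        # Exact-match key: any change to the prompt yields a new key.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def remember(self, prompt: str, chat_id: str) -> None:
        key = self._key(prompt)
        self._chats[key] = chat_id
        self._chats.move_to_end(key)
        # Drop the oldest entries once the cache is over capacity.
        while len(self._chats) > self.max_chats:
            self._chats.popitem(last=False)

    def find(self, prompt: str) -> Optional[str]:
        return self._chats.get(self._key(prompt))


cache = MatchingChatCache()
cache.remember("Describe the tavern.", "chat-1")
action = "regenerate" if cache.find("Describe the tavern.") else "new chat"
print(action)
```

A swipe in SillyTavern resends the same prompt, so the lookup succeeds and the existing chat can be regenerated instead of opening a new one.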
Delete Chat After Reply
If you want Qwen's chat list cleaned up automatically, IntenseRP can delete the completed Qwen chat after a successful reply finishes.
Settings -> Provider Behavior -> QwenLM -> Delete Chat After Reply
Slower requests
This adds extra cleanup work after each request, so it can slow requests down quite a bit.
No chat reuse here
This does not work together with Reuse Matching Chat or Search Older Matching Chats.
See also: Chat Auto-Deletion
Login notes
QwenLM supports Auto Login (email + password). Persistent Sessions are still recommended, since they reduce how often you need to log in.
See: Login & Sessions
UI language requirement
The QwenLM driver currently expects the Qwen web UI language to be English (en / en-US). If QwenLM is set to another language, IntenseRP may fail to find buttons/toggles reliably.
If you see a warning about QwenLM UI language:
- Change QwenLM language to English
- Reload the page (F5 / Ctrl+R)
- Retry / restart the browser from IntenseRP if needed
Per-message macros
You can add simple [[...]] macros to the latest user message in SillyTavern to override certain QwenLM Behavior settings for that request only.
All macros are stripped from the message before sending it to QwenLM.
| Macro | Effect |
|---|---|
| [[think]], [[r1]] | Force Thinking on |
| [[nothink]], [[r0]] | Force Thinking off |
| [[search]] | Force Search on |
| [[nosearch]], [[no_search]] | Force Search off |
| [[tool]], [[tools]] | Force Tools on |
| [[notool]], [[notools]], [[no_tool]], [[no_tools]] | Force Tools off |
| [[file]] | Force Send As Text File on |
| [[nofile]] | Force Send As Text File off |
Scope
Only macros from the latest user message apply. They do not persist across requests.
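The macro handling described above amounts to scanning the latest user message, collecting overrides, and removing the macro text. The sketch below is an illustration under stated assumptions, not IntenseRP's parser: `extract_macros` and the override key names are hypothetical, matching is assumed to be case-insensitive, unrecognized [[...]] text is assumed to be left untouched, and the whitespace cleanup is a convenience of this example.

```python
import re

# Macro name -> (hypothetical override key, forced value), per the table.
MACROS = {
    "think": ("thinking", True), "r1": ("thinking", True),
    "nothink": ("thinking", False), "r0": ("thinking", False),
    "search": ("search", True),
    "nosearch": ("search", False), "no_search": ("search", False),
    "tool": ("tools", True), "tools": ("tools", True),
    "notool": ("tools", False), "notools": ("tools", False),
    "no_tool": ("tools", False), "no_tools": ("tools", False),
    "file": ("send_as_text_file", True),
    "nofile": ("send_as_text_file", False),
}

_MACRO_RE = re.compile(r"\[\[([A-Za-z0-9_]+)\]\]")


def extract_macros(message: str) -> tuple[str, dict]:
    """Return (message without macros, per-request overrides)."""
    overrides: dict = {}

    def _strip(match: re.Match) -> str:
        name = match.group(1).lower()  # assumption: case-insensitive
        if name not in MACROS:
            return match.group(0)      # assumption: unknown text is kept
        key, value = MACROS[name]
        overrides[key] = value         # a later macro wins over an earlier one
        return ""

    cleaned = _MACRO_RE.sub(_strip, message)
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned).strip()
    return cleaned, overrides


text, overrides = extract_macros("Continue the scene [[think]] [[nosearch]]")
print(text)
print(overrides)
```

The overrides dictionary is built fresh for each call, which mirrors the rule that macros apply to a single request and never persist.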
Quick Reference
| Setting | What It Does | Default |
|---|---|---|
| Model | Selects the model in QwenLM's real model picker (UI) | Qwen3.5-Plus |
| Enable Thinking | Switches QwenLM into Thinking (vs Fast) | Off |
| Send Thinking | Includes thinking summaries in response | Off |
| Count Tokens | Returns token usage in API responses | On |
| Enable Search | Enables QwenLM Web search | Off |
| Enable Tools | Enables QwenLM Tools when available | Off |
| Send As Text File | Uploads prompt as .txt | Off |
| Text File Message | Text pasted alongside uploaded file | (empty) |
| File Upload Timeout | Seconds to wait after upload | 20 |
| Message Send Timeout (s) | Seconds to wait for send button | 8 |
| Completion Request Timeout (s) | Seconds to wait for Qwen's backend request | 150 |
| First Chunk Timeout (s) | Seconds to wait for Qwen's stream to start | 150 |
| Reuse Matching Chat | Regenerates on duplicate prompts | Off |
| Delete Chat After Reply | Deletes the completed Qwen chat after a successful reply | Off |