fix(lmstudio): force encoding_format=float to avoid SDK base64 decode

The OpenAI SDK (≥6.x, since openai-node#1312) auto-injects `encoding_format: "base64"` into every embeddings request when the caller doesn't specify one, then unconditionally decodes the response with `toFloat32Array(embedding as unknown as string)`. LM Studio's Local Server ignores `encoding_format` and always returns a JSON array of floats.

The SDK then runs `Buffer.from(<array>, 'base64')`. Node.js silently ignores the encoding argument for array inputs and coerces each element to a uint8, so every float with magnitude below 1.0 truncates to 0. For a 4096-dim model this produces a 4096-byte zero buffer, which is then reinterpreted as a 1024-element Float32Array of zeros.

Net effect: every LM Studio embedding came back as all zeros at a quarter of the model's true dimension (1024 for the 4096-dim model tested). Qdrant then rejected the upserts with `Vector dimension error: expected dim: <model>, got 1024`, and indexing silently failed with all points skipped.

Fix: pass `encoding_format: "float"` explicitly. The SDK detects the user-provided value (hasUserProvidedEncodingFormat=true), skips the decode step, and returns LM Studio's float array as-is.

Verified with Qwen3-Embedding-8B (4096-dim): all DEBUG_LMSTUDIO_EMBED log entries now show firstEmbeddingDim=4096, no skipped upserts.
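
The zero-vector failure can be reproduced without the SDK. The sketch below is illustrative only: `decodeAsBase64` is a hypothetical stand-in for the SDK's decode helper (its body is an assumption reconstructed from the description in this message, not copied from openai-node), and the sample vector is made up.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical stand-in for the SDK's base64 decode step (assumed shape,
// not the real openai-node source).
function decodeAsBase64(embedding: unknown): number[] {
  // The value is treated as a base64 string even when the server sent an array.
  // For array input, Buffer.from ignores the encoding argument and coerces
  // every element to a uint8.
  const buf = Buffer.from(embedding as string, "base64");
  const floats = new Float32Array(
    buf.buffer,
    buf.byteOffset,
    buf.length / Float32Array.BYTES_PER_ELEMENT,
  );
  return Array.from(floats);
}

// Made-up 8-dim embedding, shaped like what LM Studio returns (plain floats).
const fromServer = [0.0213, -0.4471, 0.9312, -0.0075, 0.1, -0.2, 0.3, -0.4];
const decoded = decodeAsBase64(fromServer);

console.log(decoded.length);                 // 2
console.log(decoded.every((x) => x === 0));  // true
```

Eight input floats become eight zero bytes, which read back as two zero float32 values. This is the same 4:1 shrink that turns a 4096-dim vector into 1024 zeros.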
2026-07-03 14:05:21 +02:00 · 2026-05-04 12:59:20 +02:00
parent f10786530f
commit bb141a0b3f
1 changed file with 11 additions and 0 deletions
@@ -179,9 +179,20 @@ export class LMStudioEmbeddingProvider implements EmbeddingProvider {
    // No `dimensions` parameter: LM Studio doesn't implement Matryoshka projection.
    // The model returns its native dimension and we trust the user to have set
    // EMBEDDING_DIMENSIONS to match.
+    //
+    // `encoding_format: "float"` is REQUIRED. The OpenAI SDK (6.x+) defaults to
+    // `encoding_format: "base64"` for performance, then unconditionally decodes the
+    // response with toFloat32Array(). LM Studio ignores `encoding_format` and always
+    // returns a plain JSON array of floats. The SDK's decode path then runs
+    // `Buffer.from(<array>, 'base64')`. Node.js silently ignores the encoding for
+    // array inputs and truncates each float (|x| < 1.0) to uint8 0, so a 4096-dim
+    // vector becomes 4096 zero bytes, reread as a 1024-element zero Float32Array.
+    // Setting `encoding_format: "float"` makes the SDK skip the decode step entirely
+    // (see openai-node/src/resources/embeddings.ts: `if (hasUserProvidedEncodingFormat)`).
    const response = await client.embeddings.create({
      model,
      input: texts,
+      encoding_format: "float",
    });
    const sorted = response.data.sort((a, b) => a.index - b.index);
    return sorted.map((d) => d.embedding);