mirror of https://github.com/giancarloerra/socraticode.git
synced 2026-07-03 14:05:21 +02:00
fix(lmstudio): force encoding_format=float to avoid SDK base64 decode
The OpenAI SDK (≥6.x, since openai-node#1312) auto-injects `encoding_format: "base64"` into every embeddings request when the caller doesn't specify one, then unconditionally decodes the response with `toFloat32Array(embedding as unknown as string)`.

LM Studio's Local Server ignores `encoding_format` and always returns a JSON array of floats. The SDK then runs `Buffer.from(<array>, 'base64')`. Node.js silently ignores the encoding argument for array inputs and coerces each element to a uint8, so every float with magnitude below 1.0 becomes 0. For a 4096-dim model this produces a 4096-byte zero buffer that gets reinterpreted as a 1024-element Float32Array of zeros.

Net effect: every LM Studio embedding came back as an all-zero vector with a quarter of the model's true dimension (1024 zeros for a 4096-dim model). Qdrant then rejected the upserts with `Vector dimension error: expected dim: <model>, got 1024`, and indexing silently failed with all points skipped.

Fix: pass `encoding_format: "float"` explicitly. The SDK detects the user-provided value (hasUserProvidedEncodingFormat=true), skips the decode step, and returns LM Studio's float array as-is.

Verified with Qwen3-Embedding-8B (4096-dim): all DEBUG_LMSTUDIO_EMBED log entries now show firstEmbeddingDim=4096, no skipped upserts.
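The decode failure described above can be reproduced in isolation with a few lines of Node.js. This is a minimal sketch, not the SDK's actual source: `decodeAsBase64` is a hypothetical stand-in for the base64 decode path, and the fake embedding values are made up. It only assumes what the commit message states, namely that a plain array of floats reaches `Buffer.from(..., 'base64')`.

```typescript
// Hypothetical stand-in for the SDK's base64 decode path in Node.js.
// The real helper expects a base64 string; here it receives whatever the
// server returned, which for LM Studio is a plain array of floats.
function decodeAsBase64(payload: unknown): number[] {
  // For an array argument, Buffer.from ignores the encoding parameter and
  // coerces each element to a byte. Any float with magnitude < 1 becomes 0.
  const buf = Buffer.from(payload as string, "base64");
  // Reinterpret the bytes as 32-bit floats: 4 bytes per output element.
  const floats = new Float32Array(
    buf.buffer,
    buf.byteOffset,
    buf.length / Float32Array.BYTES_PER_ELEMENT,
  );
  return Array.from(floats);
}

// Made-up 4096-dim "embedding" whose values all lie strictly inside (-1, 1).
const fakeEmbedding: number[] = Array.from(
  { length: 4096 },
  (_, i) => Math.sin(i + 1) * 0.9,
);

const decoded = decodeAsBase64(fakeEmbedding);
console.log(decoded.length); // 1024
console.log(decoded.every((x) => x === 0)); // true
```

The output length is the input length divided by four because each input float is first squashed into a single byte, and four of those bytes are then read back as one Float32.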
@@ -179,9 +179,20 @@ export class LMStudioEmbeddingProvider implements EmbeddingProvider {
    // No `dimensions` parameter: LM Studio doesn't implement Matryoshka projection.
    // The model returns its native dimension and we trust the user to have set
    // EMBEDDING_DIMENSIONS to match.
    //
    // `encoding_format: "float"` is REQUIRED. The OpenAI SDK (6.x+) defaults to
    // `encoding_format: "base64"` for performance, then unconditionally decodes the
    // response with toFloat32Array(). LM Studio ignores `encoding_format` and always
    // returns a plain JSON array of floats. The SDK's decode path then runs
    // `Buffer.from(<array>, 'base64')`; Node.js silently drops the encoding for
    // array inputs and clamps each float (<1.0) to uint8 0, producing a 4096-byte
    // zero buffer that gets reinterpreted as a 1024-element Float32Array of zeros.
    // Setting `encoding_format: "float"` makes the SDK skip the decode step entirely
    // (see openai-node/src/resources/embeddings.ts: `if (hasUserProvidedEncodingFormat)`).
    const response = await client.embeddings.create({
      model,
      input: texts,
      encoding_format: "float",
    });
    const sorted = response.data.sort((a, b) => a.index - b.index);
    return sorted.map((d) => d.embedding);
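Because the original failure only surfaced when Qdrant rejected the upserts, a cheap sanity check on the returned vectors could catch a similar regression closer to its source. The sketch below is not part of this commit: `assertPlausibleEmbeddings` is a hypothetical helper, and it assumes the caller knows the expected dimension (for example from `EMBEDDING_DIMENSIONS`).

```typescript
// Hypothetical guard, not part of the commit. It rejects vectors whose length
// differs from the configured dimension and vectors that are entirely zero,
// the two symptoms described in the commit message.
function assertPlausibleEmbeddings(
  embeddings: number[][],
  expectedDim: number,
): void {
  embeddings.forEach((vec, i) => {
    if (vec.length !== expectedDim) {
      throw new Error(
        `embedding ${i}: expected ${expectedDim} dimensions, got ${vec.length}`,
      );
    }
    if (vec.every((x) => x === 0)) {
      throw new Error(`embedding ${i}: vector is all zeros`);
    }
  });
}

// Usage with made-up data: a well-formed 4-dim vector passes silently.
assertPlausibleEmbeddings([[0.1, -0.2, 0.3, 0.4]], 4);
```

Whether an all-zero vector should be a hard error is a design choice; logging a warning instead may be preferable if a model can legitimately emit one.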