mirror of https://github.com/kolbytn/mindcraft.git
synced 2025-06-05 16:55:58 +02:00
commit 1f00e685c0
8 changed files with 108 additions and 69 deletions
README.md

@@ -2,11 +2,11 @@
Crafting minds for Minecraft with LLMs and [Mineflayer!](https://prismarinejs.github.io/mineflayer/#/)

[FAQ](https://github.com/kolbytn/mindcraft/blob/main/FAQ.md) | [Discord Support](https://discord.gg/mp73p35dzC) | [Blog Post](https://kolbynottingham.com/mindcraft/) | [Contributor TODO](https://github.com/users/kolbytn/projects/1)
[FAQ](https://github.com/kolbytn/mindcraft/blob/main/FAQ.md) | [Discord Support](https://discord.gg/mp73p35dzC) | [Video Tutorial](https://www.youtube.com/watch?v=gRotoL8P8D8) | [Blog Post](https://kolbynottingham.com/mindcraft/) | [Contributor TODO](https://github.com/users/kolbytn/projects/1)

> [!WARNING]
Do not connect this bot to public servers with coding enabled. This project allows an LLM to write/execute code on your computer. While the code is sandboxed, it is still vulnerable to injection attacks on public servers. Code writing is disabled by default; you can enable it by setting `allow_insecure_coding` to `true` in `settings.js`. We strongly recommend running with additional layers of security such as docker containers. Ye be warned.
> [!Caution]
Do not connect this bot to public servers with coding enabled. This project allows an LLM to write/execute code on your computer. The code is sandboxed, but still vulnerable to injection attacks. Code writing is disabled by default; you can enable it by setting `allow_insecure_coding` to `true` in `settings.js`. Ye be warned.

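If you do choose to enable it, the toggle lives in `settings.js`. The following is only a minimal sketch, assuming the file default-exports a plain options object; check the real file for the full option list.

```javascript
// settings.js (sketch, not the full file) -- assumed shape: a default-exported options object.
export default {
    // Leave this false unless you accept the risks described above.
    "allow_insecure_coding": false,
};
```
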
## Requirements

@@ -30,30 +30,30 @@ Do not connect this bot to public servers with coding enabled. This project allo
If you encounter issues, check the [FAQ](https://github.com/kolbytn/mindcraft/blob/main/FAQ.md) or find support on [discord](https://discord.gg/mp73p35dzC). We are currently not very responsive to github issues.

## Customization
## Model Customization

You can configure project details in `settings.js`. [See file.](settings.js)

You can configure the agent's name, model, and prompts in their profile like `andy.json`.
You can configure the agent's name, model, and prompts in their profile like `andy.json` with the `model` field. For comprehensive details, see [Model Specifications](#model-specifications).

| API | Config Variable | Example Model name | Docs |
|------|------|------|------|
| OpenAI | `OPENAI_API_KEY` | `gpt-4o-mini` | [docs](https://platform.openai.com/docs/models) |
| Google | `GEMINI_API_KEY` | `gemini-pro` | [docs](https://ai.google.dev/gemini-api/docs/models/gemini) |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-3-haiku-20240307` | [docs](https://docs.anthropic.com/claude/docs/models-overview) |
| Replicate | `REPLICATE_API_KEY` | `replicate/meta/meta-llama-3-70b-instruct` | [docs](https://replicate.com/collections/language-models) |
| Ollama (local) | n/a | `llama3` | [docs](https://ollama.com/library) |
| Groq | `GROQCLOUD_API_KEY` | `groq/mixtral-8x7b-32768` | [docs](https://console.groq.com/docs/models) |
| Hugging Face | `HUGGINGFACE_API_KEY` | `huggingface/mistralai/Mistral-Nemo-Instruct-2407` | [docs](https://huggingface.co/models) |
| Novita AI | `NOVITA_API_KEY` | `gryphe/mythomax-l2-13b` | [docs](https://novita.ai/model-api/product/llm-api?utm_source=github_mindcraft&utm_medium=github_readme&utm_campaign=link) |
| Qwen | `QWEN_API_KEY` | `qwen-max` | [Intl.](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api)/[cn](https://help.aliyun.com/zh/model-studio/getting-started/models) |
| Mistral | `MISTRAL_API_KEY` | `mistral-large-latest` | [docs](https://docs.mistral.ai/getting-started/models/models_overview/) |
| xAI | `XAI_API_KEY` | `grok-beta` | [docs](https://docs.x.ai/docs) |
| `openai` | `OPENAI_API_KEY` | `gpt-4o-mini` | [docs](https://platform.openai.com/docs/models) |
| `google` | `GEMINI_API_KEY` | `gemini-pro` | [docs](https://ai.google.dev/gemini-api/docs/models/gemini) |
| `anthropic` | `ANTHROPIC_API_KEY` | `claude-3-haiku-20240307` | [docs](https://docs.anthropic.com/claude/docs/models-overview) |
| `replicate` | `REPLICATE_API_KEY` | `replicate/meta/meta-llama-3-70b-instruct` | [docs](https://replicate.com/collections/language-models) |
| `ollama` (local) | n/a | `llama3` | [docs](https://ollama.com/library) |
| `groq` | `GROQCLOUD_API_KEY` | `groq/mixtral-8x7b-32768` | [docs](https://console.groq.com/docs/models) |
| `huggingface` | `HUGGINGFACE_API_KEY` | `huggingface/mistralai/Mistral-Nemo-Instruct-2407` | [docs](https://huggingface.co/models) |
| `novita` | `NOVITA_API_KEY` | `gryphe/mythomax-l2-13b` | [docs](https://novita.ai/model-api/product/llm-api?utm_source=github_mindcraft&utm_medium=github_readme&utm_campaign=link) |
| `qwen` | `QWEN_API_KEY` | `qwen-max` | [Intl.](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api)/[cn](https://help.aliyun.com/zh/model-studio/getting-started/models) |
| `mistral` | `MISTRAL_API_KEY` | `mistral-large-latest` | [docs](https://docs.mistral.ai/getting-started/models/models_overview/) |
| `xai` | `XAI_API_KEY` | `grok-beta` | [docs](https://docs.x.ai/docs) |

If you use Ollama, to install the models used by default (generation and embedding), execute the following terminal command:
`ollama pull llama3 && ollama pull nomic-embed-text`

## Online Servers
### Online Servers

To connect to online servers your bot will need an official Microsoft/Minecraft account. You can use your own personal one, but you will need another account if you also want to connect and play with it. To connect, change these lines in `settings.js`:
```javascript
"host": "111.222.333.444",
@@ -62,7 +62,7 @@ To connect to online servers your bot will need an official Microsoft/Minecraft
// rest is same...
```

> [!CAUTION]
> [!Important]
> The bot's name in the profile.json must exactly match the Minecraft profile name! Otherwise the bot will spam talk to itself.

To use different accounts, Mindcraft will connect with the account that the Minecraft launcher is currently using. You can switch accounts in the launcher, then run `node main.js`, then switch to your main account after the bot has connected.

@@ -87,25 +87,17 @@ When running in docker, if you want the bot to join your local minecraft server,
To connect to an unsupported minecraft version, you can try to use [viaproxy](services/viaproxy/README.md)

## Bot Profiles
# Bot Profiles

Bot profiles are json files (such as `andy.json`) that define the following; a minimal sketch appears after the list:

1. Bot backend LLMs to use for chat and embeddings.
1. Bot backend LLMs to use for talking, coding, and embedding.
2. Prompts used to influence the bot's behavior.
3. Examples that help the bot perform tasks.

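For orientation, here is a hedged sketch of that shape, shown as a JS object literal so it can carry comments. The real profiles are plain `.json` files; the values and the `conversing` prompt key here are illustrative assumptions, so check the shipped profiles in the repo for the real templates.

```javascript
// Hypothetical minimal profile (illustrative only, not the repository's andy.json).
const exampleProfile = {
    "name": "andy",            // identity; must match the in-game name for online play
    "model": "gpt-4o-mini",    // 1. backend LLM, string form (see Model Specifications)
    "conversing": "You are a playful Minecraft bot named $NAME...",  // 2. a prompt template (assumed key)
    // 3. example conversations follow the same pattern in the shipped profiles
};
```
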
### Specifying Profiles via Command Line
## Model Specifications

By default, the program will use the profiles specified in `settings.js`. You can specify one or more agent profiles using the `--profiles` argument:

```bash
node main.js --profiles ./profiles/andy.json ./profiles/jill.json
```

### Model Specifications

LLM models can be specified as simply as `"model": "gpt-4o"`. However, you can specify different models for chat, coding, and embeddings.
LLM models can be specified simply as `"model": "gpt-4o"`. However, you can use different models for chat, coding, and embeddings.
You can pass a string or an object for these fields. A model object must specify an `api`, and optionally a `model`, `url`, and additional `params`.

```json
@@ -131,11 +123,21 @@ You can pass a string or an object for these fields. A model object must specify
```

`model` is used for chat, `code_model` is used for newAction coding, and `embedding` is used to embed text for example selection. If `code_model` is not specified, then it will use `model` for coding.
`model` is used for chat, `code_model` is used for newAction coding, and `embedding` is used to embed text for example selection. If `code_model` or `embedding` are not specified, they will use `model` by default. Not all APIs have an embedding model.
All apis have default models and urls, so those fields are optional. Note some apis have no embedding model, so they will default to word overlap to retrieve examples.
All apis have default models and urls, so those fields are optional. The `params` field is optional and can be used to specify additional parameters for the model. It accepts any key-value pairs supported by the api. It is not supported for embedding models.
The `params` field is optional and can be used to specify additional parameters for the model. It accepts any key-value pairs supported by the api. It is not supported for embedding models.

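As a concrete but hypothetical illustration of how these fields combine, the sketch below shows the object form alongside the string form. The specific `url` and `params` values are assumptions for illustration, not recommended defaults, and it is written as a JS literal only so it can carry comments.

```javascript
// Sketch of the model-related profile fields (illustrative values).
const exampleModelFields = {
    "model": {
        "api": "openai",                     // required in the object form
        "model": "gpt-4o-mini",              // optional; the api's default is used if omitted
        "url": "https://api.openai.com/v1",  // optional; assumed endpoint, the api default is usually fine
        "params": { "temperature": 0.7 }     // optional key-value pairs passed through to the api
    },
    "code_model": "claude-3-haiku-20240307", // optional; falls back to `model`
    "embedding": "openai"                    // optional; falls back to `model`'s api
};
```
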
## Embedding Models

Embedding models are used to embed and efficiently select relevant examples for conversation and coding.

Supported Embedding APIs: `openai`, `google`, `replicate`, `huggingface`, `groq`, `novita`

If you try to use an unsupported model, then it will default to a simple word-overlap method. Expect reduced performance; we recommend mixing APIs to ensure embedding support.

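If you want to pin the embedding provider explicitly rather than rely on the fallback, a hypothetical object form might look like the sketch below; the model name is an assumption, and any name accepted by the chosen api should work.

```javascript
// Explicit embedding spec (sketch). Only `api` is required; the model name here is an assumption.
const exampleEmbedding = {
    "embedding": { "api": "openai", "model": "text-embedding-3-small" }
};
```
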
## Specifying Profiles via Command Line

By default, the program will use the profiles specified in `settings.js`. You can specify one or more agent profiles using the `--profiles` argument: `node main.js --profiles ./profiles/andy.json ./profiles/jill.json`

## Patches

@@ -35,9 +35,9 @@ export class Coder {
while ((match = skillRegex.exec(code)) !== null) {
skills.push(match[1]);
}
const allDocs = await this.agent.prompter.skill_libary.getRelevantSkillDocs();
//lint if the function exists
const missingSkills = skills.filter(skill => !allDocs.includes(skill));
const allDocs = await this.agent.prompter.skill_libary.getAllSkillDocs();
// check function exists
const missingSkills = skills.filter(skill => !allDocs.some(doc => doc.includes(skill)));
if (missingSkills.length > 0) {
result += 'These functions do not exist. Please use the correct function names and try again.\n';
result += '### FUNCTIONS NOT FOUND ###\n';

@@ -163,7 +163,6 @@ export class Coder {
for (let i=0; i<5; i++) {
if (this.agent.bot.interrupt_code)
return interrupt_return;
console.log(messages)
let res = await this.agent.prompter.promptCoding(JSON.parse(JSON.stringify(messages)));
if (this.agent.bot.interrupt_code)
return interrupt_return;

@@ -33,8 +33,10 @@ export const actionsList = [
},
perform: async function (agent, prompt) {
// just ignore prompt - it is now in context in chat history
if (!settings.allow_insecure_coding)
if (!settings.allow_insecure_coding) {
agent.openChat('newAction is disabled. Enable with allow_insecure_coding=true in settings.js');
return 'newAction not allowed! Code writing is disabled in settings. Notify the user.';
}
return await agent.coder.generateCode(agent.history);
}
},

@@ -1,34 +1,53 @@
import { cosineSimilarity } from '../../utils/math.js';
import { getSkillDocs } from './index.js';
import { wordOverlapScore } from '../../utils/text.js';

export class SkillLibrary {
constructor(agent,embedding_model) {
this.agent = agent;
this.embedding_model = embedding_model;
this.skill_docs_embeddings = {};
this.skill_docs = null;
}
async initSkillLibrary() {
const skillDocs = getSkillDocs();
const embeddingPromises = skillDocs.map((doc) => {
return (async () => {
let func_name_desc = doc.split('\n').slice(0, 2).join('');
this.skill_docs_embeddings[doc] = await this.embedding_model.embed(func_name_desc);
})();
});
await Promise.all(embeddingPromises);
this.skill_docs = skillDocs;
if (this.embedding_model) {
const embeddingPromises = skillDocs.map((doc) => {
return (async () => {
let func_name_desc = doc.split('\n').slice(0, 2).join('');
this.skill_docs_embeddings[doc] = await this.embedding_model.embed(func_name_desc);
})();
});
await Promise.all(embeddingPromises);
}
}

async getAllSkillDocs() {
return this.skill_docs;
}

async getRelevantSkillDocs(message, select_num) {
let latest_message_embedding = '';
if(message) //message is not empty, get the relevant skill docs, else return all skill docs
latest_message_embedding = await this.embedding_model.embed(message);

let skill_doc_similarities = Object.keys(this.skill_docs_embeddings)
if(!message) // use filler message if none is provided
message = '(no message)';
let skill_doc_similarities = [];
if (!this.embedding_model) {
skill_doc_similarities = Object.keys(this.skill_docs)
.map(doc_key => ({
doc_key,
similarity_score: wordOverlapScore(message, this.skill_docs[doc_key])
}))
.sort((a, b) => b.similarity_score - a.similarity_score);
}
else {
let latest_message_embedding = await this.embedding_model.embed(message);
skill_doc_similarities = Object.keys(this.skill_docs_embeddings)
.map(doc_key => ({
doc_key,
similarity_score: cosineSimilarity(latest_message_embedding, this.skill_docs_embeddings[doc_key])
}))
.sort((a, b) => b.similarity_score - a.similarity_score);
}

let length = skill_doc_similarities.length;
if (typeof select_num !== 'number' || isNaN(select_num) || select_num < 0) {

@@ -42,6 +61,4 @@ export class SkillLibrary {
relevant_skill_docs += selected_docs.map(doc => `${doc.doc_key}`).join('\n### ');
return relevant_skill_docs;
}

}

@@ -111,6 +111,18 @@ export async function craftRecipe(bot, itemName, num=1) {
return true;
}

export async function wait(seconds) {
/**
* Waits for the given number of seconds.
* @param {number} seconds, the number of seconds to wait.
* @returns {Promise<boolean>} true if the wait was successful, false otherwise.
* @example
* await skills.wait(10);
**/
// setTimeout is disabled to prevent unawaited code, so this is a safe alternative
await new Promise(resolve => setTimeout(resolve, seconds * 1000));
return true;
}

export async function smeltItem(bot, itemName, num=1) {
/**

@@ -90,14 +90,21 @@ export class Prompter {
this.embedding_model = new Qwen(embedding.model, embedding.url);
else if (embedding.api === 'mistral')
this.embedding_model = new Mistral(embedding.model, embedding.url);
else if (embedding.api === 'huggingface')
this.embedding_model = new HuggingFace(embedding.model, embedding.url);
else if (embedding.api === 'groq')
this.embedding_model = new GroqCloudAPI(embedding.model, embedding.url);
else if (embedding.api === 'novita')
this.embedding_model = new Novita(embedding.model, embedding.url);
else {
this.embedding_model = null;
console.log('Unknown embedding: ', embedding ? embedding.api : '[NOT SPECIFIED]', '. Using word overlap.');
let embedding_name = embedding ? embedding.api : '[NOT SPECIFIED]'
console.warn('Unsupported embedding: ' + embedding_name + '. Using word-overlap instead, expect reduced performance. Recommend using a supported embedding model. See Readme.');
}
}
catch (err) {
console.log('Warning: Failed to initialize embedding model:', err.message);
console.log('Continuing anyway, using word overlap instead.');
console.warn('Warning: Failed to initialize embedding model:', err.message);
console.log('Continuing anyway, using word-overlap instead.');
this.embedding_model = null;
}
this.skill_libary = new SkillLibrary(agent, this.embedding_model);

@@ -1,5 +1,5 @@
import { cosineSimilarity } from './math.js';
import { stringifyTurns } from './text.js';
import { stringifyTurns, wordOverlapScore } from './text.js';

export class Examples {
constructor(model, select_num=2) {

@@ -18,17 +18,6 @@ export class Examples {
return messages.trim();
}

getWords(text) {
return text.replace(/[^a-zA-Z ]/g, '').toLowerCase().split(' ');
}

wordOverlapScore(text1, text2) {
const words1 = this.getWords(text1);
const words2 = this.getWords(text2);
const intersection = words1.filter(word => words2.includes(word));
return intersection.length / (words1.length + words2.length - intersection.length);
}

async load(examples) {
this.examples = examples;
if (!this.model) return; // Early return if no embedding model

@@ -68,8 +57,8 @@
}
else {
this.examples.sort((a, b) =>
this.wordOverlapScore(turn_text, this.turnsToText(b)) -
this.wordOverlapScore(turn_text, this.turnsToText(a))
wordOverlapScore(turn_text, this.turnsToText(b)) -
wordOverlapScore(turn_text, this.turnsToText(a))
);
}
let selected = this.examples.slice(0, this.select_num);

@@ -26,6 +26,17 @@ export function toSinglePrompt(turns, system=null, stop_seq='***', model_nicknam
return prompt;
}

function _getWords(text) {
return text.replace(/[^a-zA-Z ]/g, '').toLowerCase().split(' ');
}

export function wordOverlapScore(text1, text2) {
const words1 = _getWords(text1);
const words2 = _getWords(text2);
const intersection = words1.filter(word => words2.includes(word));
return intersection.length / (words1.length + words2.length - intersection.length);
}

// ensures stricter turn order and roles:
// - system messages are treated as user messages and prefixed with SYSTEM:
// - combines repeated messages from users