Thank you @jcigala!
Okay, here is my understanding of AI so far:
AI doesn’t actually understand text, only numbers, so when you are chatting with ChatGPT for example, it is actually transforming the text into numbers behind the scenes. Such a list of numbers is called a vector (e.g. 1, 0.5, -0.9…), and a vector database is simply a database that stores and searches these vectors.
So what is RAG (Retrieval-Augmented Generation)?
RAG is a way of getting the AI to answer from your own knowledge base. How does it work?
- You convert your knowledge base (PDFs, FAQs, product database, etc.) into embeddings (a fancy name for vectors; basically you convert text to numbers) and store them in the vector database.
- The user asks a question. You embed the question the same way, search the vector database, and get back the chunks of text whose embeddings are most similar to the question’s.
- You send the question along with those retrieved chunks to an LLM to format a nice looking response (see the sketch right after this list).
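To make the retrieval step concrete, here is a minimal, framework-free sketch in Java. The embeddings are tiny made-up vectors (a real system gets them from an embedding model and they have hundreds of dimensions), and the “search” is just a cosine-similarity loop:

```java
import java.util.Map;

public class ToyVectorSearch {

    // Cosine similarity: closer to 1.0 means the two vectors point the same way,
    // i.e. the texts behind them are semantically close.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy "knowledge base": each text chunk paired with a made-up 3-dimensional embedding.
        Map<String, double[]> knowledgeBase = Map.of(
                "Orders can be returned within 30 days.", new double[]{0.9, 0.1, -0.2},
                "Shipping takes 3 to 5 business days.",   new double[]{0.1, 0.8, 0.3});

        // Pretend this is the embedding of the user's question "What is your return policy?"
        double[] question = {0.85, 0.15, -0.1};

        // Retrieval: pick the chunk whose embedding is most similar to the question's.
        String best = null;
        double bestScore = -1;
        for (Map.Entry<String, double[]> entry : knowledgeBase.entrySet()) {
            double score = cosine(question, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        System.out.println("Most relevant chunk: " + best);
        // It is this chunk's text (not its vector) that gets pasted into the LLM prompt.
    }
}
```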
That sounds like a lot of work, I don’t want to be an AI engineer, I want to work less, not more.
That’s what Langchain is for. Basically, it’s a framework that does all of these low-level tasks for you. It’s not an AI itself, but more like an interface to AI: you use it to talk to an LLM (any model, interchangeably), convert text to vectors, search the vector database, and do other AI plumbing. Langchain4j is the one for Java developers.
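For example, the whole RAG flow above shrinks to roughly this with Langchain4j. Treat it as a sketch: the class and builder names follow recent Langchain4j releases and may differ slightly between versions, and the API key and document text are placeholders:

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

public class RagSketch {

    // The "assistant" interface: Langchain4j generates the implementation for you.
    interface Assistant {
        String answer(String question);
    }

    public static void main(String[] args) {
        // 1. Embedding model + vector store (in-memory here; a real setup would use a persistent store).
        EmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

        // 2. Ingest the knowledge base: split it, embed the chunks, store the vectors.
        EmbeddingStoreIngestor.builder()
                .embeddingModel(embeddingModel)
                .embeddingStore(store)
                .build()
                .ingest(Document.from("Orders can be returned within 30 days of delivery."));

        // 3. Wire the chat model and the retriever together.
        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(OpenAiChatModel.builder()
                        .apiKey(System.getenv("OPENAI_API_KEY"))
                        .build())
                .contentRetriever(EmbeddingStoreContentRetriever.builder()
                        .embeddingStore(store)
                        .embeddingModel(embeddingModel)
                        .maxResults(2)
                        .build())
                .build();

        // The framework does the embedding, the similarity search, and the prompt assembly.
        System.out.println(assistant.answer("What is your return policy?"));
    }
}
```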
Okay, but I don’t need AI to answer questions; I need it to talk to my fiancé, schedule all of our date nights, and put them in my calendar.
Those are called Tools in Langchain (or Functions for OpenAI). Basically, a tool lets the AI call external APIs, for example to create events in your calendar or fetch your intimate messages from WhatsApp. The problem is that these tools are not AI-native, they are Langchain-specific, meaning that if you ever drop the Langchain framework, you lose the tool functionality. Unless you use OpenAI, which has its own tools called Functions, but then you need to port the code between the two, so again a pain in the groin.
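In Langchain4j a tool is just an annotated Java method. A hedged sketch (the calendar logic is a made-up placeholder; only the @Tool annotation and the AiServices wiring come from Langchain4j):

```java
import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.service.AiServices;

public class ToolSketch {

    // A hypothetical tool class: the method body would call your real calendar API.
    static class CalendarTools {
        @Tool("Creates an event in the user's calendar")
        String createEvent(String title, String dateTime) {
            // ...call your calendar API here...
            return "Created '" + title + "' at " + dateTime;
        }
    }

    interface Assistant {
        String chat(String message);
    }

    public static void main(String[] args) {
        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(OpenAiChatModel.builder()
                        .apiKey(System.getenv("OPENAI_API_KEY"))
                        .build())
                .tools(new CalendarTools())
                .build();

        // The LLM decides on its own that it needs to call createEvent to fulfil this request.
        System.out.println(assistant.chat("Schedule a date night next Friday at 7pm"));
    }
}
```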
Ha, gotcha now: AI-tool communication has already been standardized. Haven’t you heard of MCP?
Indeed, the very new and trendy Model Context Protocol created by Anthropic seems to be the most promising standardization of AI-tool communication so far. There are still a few drawbacks to it, though. First, the LLM needs to be adapted for it; currently OpenAI and Anthropic seem to be the ones that officially support it. Second, it’s the job of the AI to call the tools: you just describe what your tools can do, and it decides when to call and use them. For enterprise or more sensitive use cases you might need a bit more control over the workflow.
Nevertheless, MCP is still a great addition, as it may simplify your AI-tool interface: you only need to write it once, and you won’t need to rewrite it if you ever move away from the Langchain framework. And you can still use them together, MCP for tool communication and Langchain for managing the workflow, so you get the best of both worlds.
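Langchain4j ships an MCP client module, so the combination looks roughly like this. Treat it as a sketch: the exact class and builder names may vary between Langchain4j versions, and the server command is purely illustrative (it could just as well launch the Moqui MCP server mentioned below):

```java
import java.util.List;

import dev.langchain4j.mcp.McpToolProvider;
import dev.langchain4j.mcp.client.DefaultMcpClient;
import dev.langchain4j.mcp.client.McpClient;
import dev.langchain4j.mcp.client.transport.stdio.StdioMcpTransport;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.tool.ToolProvider;

public class McpSketch {

    interface Assistant {
        String chat(String message);
    }

    public static void main(String[] args) {
        // Connect to an MCP server over stdio; the command below is a placeholder
        // for whatever server you actually run.
        StdioMcpTransport transport = new StdioMcpTransport.Builder()
                .command(List.of("java", "-jar", "my-mcp-server.jar"))
                .build();

        McpClient mcpClient = new DefaultMcpClient.Builder()
                .transport(transport)
                .build();

        // The MCP server advertises its tools; Langchain4j exposes them to the LLM as a ToolProvider.
        ToolProvider toolProvider = McpToolProvider.builder()
                .mcpClients(List.of(mcpClient))
                .build();

        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(OpenAiChatModel.builder()
                        .apiKey(System.getenv("OPENAI_API_KEY"))
                        .build())
                .toolProvider(toolProvider)
                .build();

        // The LLM picks the MCP tool it needs; Langchain manages the surrounding workflow.
        System.out.println(assistant.chat("What open orders do we have?"));
    }
}
```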
It seems @heguangyong wrote an MCP server for Moqui, so maybe he can share his experience so far? GitHub - heguangyong/moqui-mcp
Bonus
What about using AI to search products instead of the current Elasticsearch implementation?
Depends on the use case. AI search would be more user-friendly, since you can query in natural language, but the main drawback is that it is a lot more computationally demanding than a traditional Elasticsearch query.
Note: So far, in my opinion, MCP seems to be the most practical, easiest, and fastest way to get a natural-language interface for Moqui. Combined with a local LLM (e.g. Ollama with Open WebUI), you also get data privacy. But I’m not so sure about the reliability part.