Moqui AI with Embeddings for Neural Search

michael · June 14, 2024, 11:09pm

I’ve been playing around with embeddings in OpenSearch, and I think it’s a feature that could be added to Moqui. It would make search smarter, and it would be the start of Moqui being integrated with some AI stuff.

If people are interested enough in using this, I’ll put in the effort to make it a feature in Moqui. Otherwise I’ll just do my project.

If you’re interested, comment on here how you would use it.

Here’s how it would work:

Details

OpenSearch has an ML pipeline that lets you set up certain fields on certain indexes so that, when documents are uploaded, a sentence transformer model generates an embedding vector from the text.
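As a rough sketch of what such a pipeline definition could look like, here is the request body for OpenSearch's `text_embedding` ingest processor, built as a plain Python dict. The helper name, the field names, and the `<model-id>` placeholder are assumptions for illustration only; the model would have to be registered and deployed in OpenSearch first.

```python
import json


def build_embedding_pipeline(model_id, text_field, vector_field):
    """Build the body for PUT /_ingest/pipeline/<pipeline-name>.

    model_id is the ID OpenSearch returns when a model is deployed;
    text_field and vector_field are hypothetical field names.
    """
    return {
        "description": "Generate an embedding for '%s' at index time" % text_field,
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    # source text field -> field that receives the vector
                    "field_map": {text_field: vector_field},
                }
            }
        ],
    }


if __name__ == "__main__":
    body = build_embedding_pipeline("<model-id>", "description", "description_embedding")
    print(json.dumps(body, indent=2))
```

The resulting JSON would be sent to OpenSearch once per pipeline; documents indexed through that pipeline then get the vector field filled in automatically.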

There would be a DataDocument with fields to store the index.knn and default_pipeline settings, and a DataDocumentField with a special type, not associated with a database field, for the knn_vector type.
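For reference, the index definition those DataDocument settings would need to produce might look roughly like this. It is only a sketch: the helper and field names are made up, and the dimension has to match whatever the chosen model outputs (for example, 384 for all-MiniLM-L6-v2).

```python
import json


def build_knn_index_body(pipeline_name, text_field, vector_field, dimension):
    """Build the body for PUT /<index-name> with k-NN enabled.

    All argument values are illustrative; dimension must equal the
    embedding size of the sentence transformer model in use.
    """
    return {
        "settings": {
            "index.knn": True,                  # enable k-NN search on this index
            "default_pipeline": pipeline_name,  # run the embedding pipeline on ingest
        },
        "mappings": {
            "properties": {
                text_field: {"type": "text"},
                # vector field with no backing database column
                vector_field: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


if __name__ == "__main__":
    body = build_knn_index_body("embedding-pipeline", "description",
                                "description_embedding", 384)
    print(json.dumps(body, indent=2))
```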

Then for search, the search string is sent through a sentence transformer to produce an embedding vector, and a nearest-neighbor search is run between the existing embeddings and the search string embedding.
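With the neural search plugin, OpenSearch can do the query-side embedding itself through a `neural` query, so the client only sends the text. A minimal sketch of that request body, again with hypothetical field names and a placeholder model ID:

```python
import json


def build_neural_query(vector_field, query_text, model_id, k=10):
    """Build the body for GET /<index-name>/_search using a neural query.

    OpenSearch embeds query_text with the given model and returns the
    k nearest neighbors found in vector_field.
    """
    return {
        "size": k,
        "query": {
            "neural": {
                vector_field: {
                    "query_text": query_text,
                    "model_id": model_id,
                    "k": k,
                }
            }
        },
    }


if __name__ == "__main__":
    body = build_neural_query("description_embedding", "blue running shoes", "<model-id>", k=5)
    print(json.dumps(body, indent=2))
```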

We may also need changes to the search method in ElasticFacade.java (line 77) to determine which search method to use.
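One way to picture that decision, purely as a sketch and not a reflection of how ElasticFacade is actually written: fall back to a regular `query_string` search unless the caller supplies both a vector field and a model ID. The function and parameter names are hypothetical.

```python
def build_search_body(query_text, vector_field=None, model_id=None, k=10):
    """Pick between a neural query and a plain query_string query.

    Neural search is used only when both vector_field and model_id are
    given; otherwise the existing keyword-style search is kept.
    """
    if vector_field and model_id:
        return {
            "size": k,
            "query": {
                "neural": {
                    vector_field: {
                        "query_text": query_text,
                        "model_id": model_id,
                        "k": k,
                    }
                }
            },
        }
    # Default path: ordinary full-text search
    return {"query": {"query_string": {"query": query_text}}}


if __name__ == "__main__":
    print(build_search_body("blue running shoes"))
    print(build_search_body("blue running shoes", "description_embedding", "<model-id>"))
```

Keeping keyword search as the default would mean existing callers are unaffected until a DataDocument opts in to embeddings.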

If there’s enough need for it, additional entities can be added for models and pipelines.

We may also need screens for setting up a model / pipeline.

newmannhu · June 17, 2024, 8:16am

Great features!

m.ashtari · July 3, 2024, 1:54pm

thanks michael

integrines · September 27, 2024, 8:41pm

Hi Michael, Looks interesting, let us connect on this

arun · October 6, 2024, 8:02pm

Great, Michael.

Should OpenSearch be run as a separate service outside of the Moqui framework, or is it better to integrate it directly within Moqui using a minimal distribution of OpenSearch?

We are planning to use OpenSearch to integrate a RAG chat system with document files stored in the Jackrabbit repository. Additionally, I don’t know how to load Jackrabbit data into OpenSearch. Please suggest a recommended approach for doing this with Moqui.

michael · October 9, 2024, 9:42pm

For production, OpenSearch would ideally run as a separate service, especially if you’re using it for any AI work. A separate Docker container would be fine, though.

I’ve done a bit of work on this in this thread.

The shortlist of things to do is:

update opensearch version
read and use opensearch documentation

If you need more help on this, reach out to me.