Forum OpenACS Development: Re: Semantic Search in OpenACS

Posted by Neophytos Demetriou on
Added pgembedding-driver, which uses the pg_embedding PostgreSQL extension, to my openacs-packages repository (see above). It may be faster than pgvector, but pgvector seems more robust to me at the moment. I had to jump through some hoops to get it to compile on my system.

Anyway, the two vector similarity packages I provided, together with tbert, are out there as a proof of concept that Tcl/OpenACS has these capabilities now.

Here's my plan for the next couple of days:

1. Resolve the issue with the current version of tsearch2-driver (see above: ranking is lost, so search results are NOT in order). Most likely I'll provide a drop-in replacement for it, i.e. pgfts-driver.

2. Polish tbert, pgvector-driver, and pgembedding-driver. For example, cmake does not seem to be the preferred choice for NaviServer modules. I'll sort it out. Just wanted to have all dependencies installed from the same git repo. Furthermore, I need to add the index for the pgvector-driver package and add permission checking.

3. Ideally, the vector similarity packages should be used to provide additional search results (a "similar to your search" kind of thing). This would require some changes to the search package. They are easy to make but they are out of scope for now.

4. I won't do solr and faiss unless someone really needs them. Both solr and faiss have the downside that they won't have the acs objects. So, searching will slow things.
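A minimal sketch of the "similar to your search" idea from item 3, written in Python purely for illustration (the packages themselves are Tcl/C). It ranks stored embeddings by cosine similarity to a query embedding with a brute-force scan. The object ids and three-dimensional vectors are made-up toy data, and `cosine_similarity` and `top_k` are hypothetical helper names; in practice the embeddings would come from a model such as the one tbert wraps, and pgvector would do the distance computation inside PostgreSQL (it exposes cosine distance as the `<=>` operator).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, docs, k):
    """Return the ids of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), object_id)
              for object_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [object_id for _, object_id in scored[:k]]


# Toy data: object_id -> embedding (assumed values, not real model output).
docs = {
    101: [1.0, 0.0, 0.0],
    102: [0.9, 0.1, 0.0],
    103: [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

similar = top_k(query, docs, 2)
print(similar)
```

A brute-force scan like this is linear in the number of stored vectors, which is why an index on the embedding column (item 2) matters once the table grows.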

PS. If there is no interest from the community in vector similarity search, I might as well turn my attention to a two-factor authentication solution for OpenACS. It would also require a C-based module (maybe two NaviServer modules), but I have that under control.

Posted by Neophytos Demetriou on
"So, searching will slow things."

I meant that retrieving the search results (acs objects) from the db after solr or faiss has responded is the part that slows things down.
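To make the extra round trip concrete, here is a rough Python sketch under stated assumptions: an external index (solr or faiss in this discussion) has already returned a ranked list of object ids, and the application then has to fetch the matching rows from the database and put them back into rank order, because an `IN (...)` query does not preserve it. The in-memory SQLite database stands in for PostgreSQL, and the `objects` table, its columns, and `fetch_in_rank_order` are hypothetical names, not the OpenACS data model.

```python
import sqlite3


def fetch_in_rank_order(conn, ranked_ids):
    """Fetch rows for ranked_ids and return them in the external index's rank order."""
    if not ranked_ids:
        return []
    placeholders = ",".join("?" for _ in ranked_ids)
    rows = conn.execute(
        f"SELECT object_id, title FROM objects WHERE object_id IN ({placeholders})",
        ranked_ids,
    ).fetchall()
    by_id = {object_id: title for object_id, title in rows}
    # The database returns rows in its own order; restore the index's ranking
    # and skip ids that no longer exist in the database.
    return [(object_id, by_id[object_id])
            for object_id in ranked_ids if object_id in by_id]


# Stand-in database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (object_id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?)",
    [(1, "First post"), (2, "Second post"), (3, "Third post")],
)

# Pretend the external index answered with these ids, best match first.
ranked_ids = [3, 1, 2]
results = fetch_in_rank_order(conn, ranked_ids)
print(results)
```

When the similarity search runs inside PostgreSQL instead, the ranking and the object lookup can be expressed in one query, so this second step goes away.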

Posted by Neophytos Demetriou on
"For example, cmake does not seem to be the preferred choice for NaviServer modules. I'll sort it out. Just wanted to have all dependencies installed from the same git repo."

This is done (added a NaviServer module Makefile, thanks Gustaf), and I've also made the cmake installation more robust for the Tcl installation. If you have trouble installing tbert on your system, please do not hesitate to contact me via email or here. I'm using Ubuntu Linux and so are the docker images.