
Commit d260547
committed: update readme for cache, batch
1 parent 8380372

File tree: 1 file changed (+60, -12 lines)

README.md

Lines changed: 60 additions & 12 deletions
@@ -1,16 +1,17 @@
# vector-embedding-api
-`vector-embedding-api` provides a Flask API server and client to generate text embeddings using either [OpenAI's embedding model](https://platform.openai.com/docs/guides/embeddings) or the [SentenceTransformers](https://www.sbert.net/) library. The API server also offers an in-memory cache for embeddings and returns results from the cache when available.
+`vector-embedding-api` provides a Flask API server and client to generate text embeddings using either [OpenAI's embedding model](https://platform.openai.com/docs/guides/embeddings) or the [SentenceTransformers](https://www.sbert.net/) library. The API server now supports in-memory LRU caching for faster retrieval, batch processing for handling multiple texts at once, and a health status endpoint for monitoring the server.

SentenceTransformers supports over 500 models via [HuggingFace Hub](https://huggingface.co/sentence-transformers).

## Features 🎯
* POST endpoint to create text embeddings
* sentence_transformers
* OpenAI text-embedding-ada-002
-* In-memory cache for embeddings
+* In-memory LRU cache for quick retrieval of embeddings
+* Batch processing to handle multiple texts in a single request
* Easy setup with configuration file
-* Simple integration with other applications
-* Python client utility for submitting text
+* Health status endpoint
+* Python client utility for submitting text or files

### Installation 💻
To run this server locally, follow the steps below:
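The feature list in this hunk advertises an in-memory LRU cache for embeddings. Purely as a generic sketch of the LRU idea, and not the project's actual implementation (the class name, the `(model, text)` key, and the `max_size` semantics are all assumptions here; server.py may do this differently), such a cache can be built on Python's `collections.OrderedDict`:

```python
from collections import OrderedDict

class EmbeddingLRUCache:
    """Generic LRU cache sketch: maps (model, text) -> embedding vector."""

    def __init__(self, max_size=None):
        self.max_size = max_size       # None = no eviction limit
        self._store = OrderedDict()

    def get(self, model, text):
        key = (model, text)
        if key not in self._store:
            return None
        self._store.move_to_end(key)   # mark as most recently used
        return self._store[key]

    def put(self, model, text, embedding):
        key = (model, text)
        self._store[key] = embedding
        self._store.move_to_end(key)
        if self.max_size is not None and len(self._store) > self.max_size:
            self._store.popitem(last=False)   # evict the least recently used entry
```

A `max_size` of None would mean no eviction, which is consistent with the `"max_size": null` value shown in the /health example later in this diff.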
@@ -33,9 +34,8 @@ pip install -r requirements.txt
```

### Usage
-Modify the [server.conf](/server.conf) file to specify a SentenceTransformers model, your OpenAI API key, or both.

-**Modify the server.conf configuration file:** ⚙️
+**Modify the [server.conf](/server.conf) configuration file:** ⚙️
```ini
[main]
openai_api_key = YOUR_OPENAI_API_KEY
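Only the `[main]` section header and `openai_api_key` appear in this hunk; the rest of server.conf is not shown. Purely as an illustration of the ini format, here is a sketch of reading that section with Python's standard-library `configparser` (the variable names are hypothetical, and the server's real loading code may differ):

```python
import configparser

# Read the [main] section of server.conf; only openai_api_key is visible in this
# hunk, so any other options (such as which SentenceTransformers model to load)
# are not reproduced here.
config = configparser.ConfigParser()
config.read("server.conf")

openai_api_key = config.get("main", "openai_api_key", fallback=None)
print("OpenAI embeddings configured:", bool(openai_api_key))
```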
@@ -52,19 +52,19 @@ The server should now be running on http://127.0.0.1:5000/.

### API Endpoints 🌐
##### Client Usage
-A small [Python client](/client.py) is provided to assist with submitting text strings or text files.
+A small [Python client](/client.py) is provided to assist with submitting text strings or files.

**Usage**
`python3 client.py -t "Your text here" -m local`

`python3 client.py -f /path/to/yourfile.txt -m openai`

#### POST /submit
-Submits a text string for embedding generation.
+Submits an individual text string or a list of text strings for embedding generation.

**Request Parameters**

-* **text:** The text string to generate the embedding for. (Required)
+* **text:** The text string or list of text strings to generate the embedding for. (Required)
* **model:** Type of model to be used, either local for SentenceTransformer models or openai for OpenAI's model. Default is local.

**Response**
@@ -76,6 +76,31 @@ Submits a text string for embedding generation.
* **cache:** Boolean indicating if the result was retrieved from cache. (Optional)
* **message:** Error message if the status is error. (Optional)

+#### GET /health
+Checks the server's health status.
+
+**Response**
+
+* **cache.enabled:** Boolean indicating whether caching is enabled
+* **cache.max_size:** Maximum cache size
+* **cache.size:** Current cache size
+* **models.openai:** Boolean indicating if OpenAI embeddings are enabled. (Optional)
+* **models.sentence-transformers:** Name of the sentence-transformers model in use.
+
+```json
+{
+  "cache": {
+    "enabled": true,
+    "max_size": null,
+    "size": 0
+  },
+  "models": {
+    "openai": true,
+    "sentence-transformers": "sentence-transformers/all-MiniLM-L6-v2"
+  }
+}
+```
+
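A quick way to exercise the new endpoint documented above is a plain GET request. The sketch below assumes the server is running at the default address from the README (http://127.0.0.1:5000) and uses the third-party `requests` package, which may or may not be in the project's requirements; the keys it reads match the example response in this hunk.

```python
import requests

# Probe GET /health; the keys accessed below mirror the example response above.
resp = requests.get("http://127.0.0.1:5000/health")
resp.raise_for_status()
health = resp.json()

print("cache enabled:", health["cache"]["enabled"])
print("cache size:", health["cache"]["size"], "of", health["cache"]["max_size"])
print("models:", health["models"])
```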
#### Example Usage
Send a POST request to the /submit endpoint with JSON payload:
@@ -84,15 +109,38 @@ Send a POST request to the /submit endpoint with JSON payload:
  "text": "Your text here",
  "model": "local"
}
+
+// multi text submission
+{
+  "text": ["Text1 goes here", "Text2 goes here"],
+  "model": "openai"
+}
```

You'll receive a response containing the embedding and additional information:

```json
-{
+[
+  {
    "embedding": [...],
    "status": "success",
-    "elapsed": 293.52,
+    "elapsed": 123,
    "model": "sentence-transformers/all-MiniLM-L6-v2"
-}
+  }
+]
+
+[
+  {
+    "embedding": [...],
+    "status": "success",
+    "elapsed": 123,
+    "model": "openai"
+  },
+  {
+    "embedding": [...],
+    "status": "success",
+    "elapsed": 123,
+    "model": "openai"
+  }
+]
```
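For readers who want to call the HTTP API directly instead of going through client.py, here is a minimal sketch mirroring the two payloads above. It assumes the server is running at the README's default address (http://127.0.0.1:5000) and uses the third-party `requests` package; the field names and the list-shaped responses are taken from the examples in this hunk.

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # default address from the README

# Single text with the local SentenceTransformers model (first payload above).
single = requests.post(f"{BASE_URL}/submit",
                       json={"text": "Your text here", "model": "local"})
print(single.json())

# Batch submission with the OpenAI model (second payload above); the response
# is a list with one entry per input text, as in the example responses.
batch = requests.post(f"{BASE_URL}/submit",
                      json={"text": ["Text1 goes here", "Text2 goes here"],
                            "model": "openai"})
for item in batch.json():
    print(item["status"], item["model"], "elapsed:", item["elapsed"])
```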
