To facilitate the use of Large Language Models, we now have an API service that deploys models individual students cannot run within their allocated compute budgets.

**Usage**

* POST requests to the API are JSON-encoded strings; a text key is required, and any other keyword argument accepted by the vLLM sampling params found [here](https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py) may be added (see the first sketch after this list)
* Additionally, output can be streamed with the boolean kwarg 'stream', and logits can be returned using the boolean kwarg 'include_logits' (a streaming sketch appears after this list)
* For completion tasks, the prompt used can be prepended to the output with the boolean kwarg 'include_prompt'
* A convenience function, as well as default sampling configurations, can be found in [this repository](https://github.com/Parry-Parry/idaLLM/tree/main/idallm)
* Using the request function of the idaLLM package, you can easily send single or batched prompts (a hedged sketch follows this list)
* The [LightChain](https://github.com/Parry-Parry/LightChain/tree/main) package adds further functionality for prompt chaining, along with convenient objects for dialogue memory and complex prompts
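
As a first sketch, here is a plain POST request against the /generate endpoint, using the Llama 2 URL from the table below. The extra sampling parameters shown are standard vLLM SamplingParams fields; the exact shape of the JSON response is an assumption, so consult /docs for the authoritative schema.

```python
import requests

# Llama 2 (7B) endpoint from the Current Models table below
API_URL = "http://llama2api-ir.ida.dcs.gla.ac.uk/generate"

payload = {
    "text": "Explain what an inverted index is in two sentences.",  # required key
    # Any vLLM SamplingParams field can be passed alongside 'text':
    "temperature": 0.7,
    "max_tokens": 128,
    # API-specific boolean kwargs described above:
    "include_logits": True,   # also return logits
    "include_prompt": True,   # prepend the prompt to the completion
}

response = requests.post(API_URL, json=payload)
response.raise_for_status()
print(response.json())  # response structure is an assumption; see /docs
```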
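Streaming works the same way, with 'stream' set in the payload. This sketch assumes the server emits newline-delimited chunks; the actual framing may differ, so again check /docs.

```python
import requests

API_URL = "http://llama2api-ir.ida.dcs.gla.ac.uk/generate"

# stream=True on the requests side keeps the connection open so chunks can
# be consumed as they arrive; 'stream': True asks the API itself to stream.
with requests.post(
    API_URL,
    json={"text": "Write a haiku about Glasgow.", "stream": True},
    stream=True,
) as response:
    response.raise_for_status()
    # Assumption: chunks arrive newline-delimited; adapt to the real framing.
    for chunk in response.iter_lines(decode_unicode=True):
        if chunk:
            print(chunk)
```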
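Finally, a hedged sketch of the idaLLM convenience function. The import path, function name, and signature below are assumptions based only on the description above; refer to the repository for the actual interface.

```python
# Hypothetical usage of the idaLLM convenience function; the import path and
# signature are assumptions, not the package's documented API.
from idallm import request  # assumed entry point

URL = "http://llama2api-ir.ida.dcs.gla.ac.uk/generate"

# Single prompt
answer = request(URL, "What is BM25?")

# Batched prompts: a list in, presumably a list of completions out
answers = request(URL, ["What is BM25?", "What is TF-IDF?"])
```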

**Current Models**

Any models currently being served will be noted here, along with their corresponding URLs. Note that all models should be queried at the endpoint /generate; documentation for the API can be found at /docs for any model.

| Model | URL |
| ------ | ------ |
| Llama 2 (7B) | http://llama2api-ir.ida.dcs.gla.ac.uk/ |
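
For example, a minimal query against the Llama 2 deployment (response shape assumed; the interactive documentation at /docs on the same host is authoritative):

```python
import requests

BASE = "http://llama2api-ir.ida.dcs.gla.ac.uk/"

# Every model exposes /generate for inference and /docs for documentation.
resp = requests.post(BASE + "generate", json={"text": "Hello!"})
resp.raise_for_status()
print(resp.json())  # response structure is an assumption; consult /docs
```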