
Choosing the right model and stack elements for your app

Choosing your AI app software:

AI decision-making tree:

Do you need a model for handling private data or discussing controversial/taboo subjects?

Note: leading LLM providers are careful about how they use private data. OpenAI, for example, has stated that “As of March 1, 2023, data sent to the OpenAI API will not be used to train or improve OpenAI models”. So you only really need to consider hosting a model locally if privacy is as critical as it is in sectors like healthcare or defense.

Yes:

➡️
Dolly - An open-source, instruction-tuned LLM from Databricks
➡️
Bloom - A multilingual model which you can demo on HuggingFace spaces (HuggingFace account required).
➡️
GPT-NeoX - You can access the GPT-NeoX code to build a generator yourself.

Note: these models are open source, so you will need to host them yourself, and there is no ready-made interface on their websites for playing around with them - you will need an API and a front-end for that (see the next section for recommendations). Instructions for self-hosting are widely available online, but it will require some coding work.
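As a rough illustration of what self-hosting involves, here is a minimal sketch that runs an open-source model locally with the HuggingFace transformers library. The checkpoint name is just an example (a small BLOOM variant); Dolly and GPT-NeoX checkpoints live on the HuggingFace Hub and load the same way, though the larger ones need a GPU with plenty of memory.

```python
# Minimal local text-generation sketch with an open-source model.
# Assumes: pip install transformers torch (plus enough memory for the chosen checkpoint).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="bigscience/bloom-560m",  # small example checkpoint; swap in Dolly, GPT-NeoX, etc.
)

result = generator("Explain what an API is in one sentence.", max_new_tokens=60)
print(result[0]["generated_text"])
```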

No:

Use models offered by OpenAI or a similar closed-source company:

➡️
GPT-3.5 or GPT-4 through OpenAI - State-of-the-art text generators

P.S. You can also choose other OpenAI models in the OpenAI Playground - the same service, but with more options for fine-tuning the output

- Playground offers developer options like:

  • Model version: Choose how new (and how pricey) the underlying model is.
  • Temperature: Controls how creative or predictable the generated output is.
  • Frequency penalty: Controls how strongly the model avoids re-using the same words.
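The same knobs appear as parameters when you call the model through the API instead of the Playground. Below is a minimal sketch using the official openai Python package (v1-style client); the model name and parameter values are only illustrative.

```python
# Minimal sketch: the Playground options expressed as API parameters.
# Assumes: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",   # model version: newer/larger models cost more
    temperature=0.9,         # higher = more creative, lower = more predictable
    frequency_penalty=0.5,   # discourages re-using the same words
    messages=[{"role": "user", "content": "Write a tagline for a coffee shop."}],
)
print(response.choices[0].message.content)
```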
➡️
Claude by Anthropic - A model capable of a wide variety of conversational and text-processing tasks, with an enormous maximum prompt length (as of this writing: 75,000 words, or roughly 300 pages)
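Access to Claude is likewise through an API. Here is a minimal sketch using Anthropic's official Python SDK (the Messages interface); the model identifier is a placeholder, so check Anthropic's documentation for current model names.

```python
# Minimal sketch of calling Claude through Anthropic's Python SDK.
# Assumes: pip install anthropic, and ANTHROPIC_API_KEY set in the environment.
from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    model="claude-3-haiku-20240307",  # placeholder; check Anthropic's docs for current names
    max_tokens=300,
    messages=[{"role": "user", "content": "Summarize the plot of Moby-Dick in two sentences."}],
)
print(message.content[0].text)
```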

Does your AI app need to be fast and cheap?

Yes:

➡️
Curie model - Choose “Curie” among the available models in the OpenAI Playground
➡️
Cohere - Provides access to advanced Large Language Models and NLP tools through one easy-to-use API (not open source).
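To show what the fast-and-cheap route looks like in code, here is a minimal sketch using Cohere's Python SDK (the v4-era generate endpoint); the model name is an example of one of their smaller models, and you will need a Cohere API key.

```python
# Minimal sketch of a fast, cheap generation call via Cohere's Python SDK (v4-era interface).
# Assumes: pip install cohere, and a valid Cohere API key.
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")  # placeholder key

response = co.generate(
    model="command-light",  # example of a smaller, cheaper model
    prompt="Write a one-line product description for a smart water bottle.",
    max_tokens=60,
)
print(response.generations[0].text)
```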

No:

As before, use models offered by OpenAI, Anthropic, or a similar closed-source company.

Hosted APIs

APIs are the venue through which a genAI tool interacts with an AI model (see the glossary below for a more detailed definition). For closed-source models like ChatGPT, the API is offered only by the company that made the model, but for open-source models you have a choice of API hosting options.

➡️
NLP Cloud - A service for hosting open-source models with API development options
➡️
GooseAI - Fully managed NLP-as-a-Service delivered via API

Pre-built APIs for hosting your open-source model application:

➡️
HuggingFace - The aforementioned storehouse of open-source tools, where you can also host your own models
➡️
Replicate.com - Similar service with slightly different rights and payment options
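To give a feel for what a pre-built hosted API looks like in practice, here is a minimal sketch that calls an open-source model through HuggingFace's hosted Inference API over plain HTTP; the endpoint pattern and model name are examples as of this writing, and you will need a HuggingFace access token.

```python
# Minimal sketch: calling an open-source model via HuggingFace's hosted Inference API.
# Assumes: pip install requests, and a HuggingFace access token.
import requests

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom-560m"  # example model
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder token

payload = {"inputs": "Explain cloud GPUs in one sentence."}
response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())
```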

Cloud GPUs

If you are running an open-source model on your own rather than relying on a pre-existing service, you will need specialized hardware called GPUs (graphics processing units). Buying GPUs outright is very expensive, so most small-to-medium providers rent cloud GPUs - GPUs running in a remote data center and made available to you over the internet. The following companies offer cloud GPUs (a short sketch after the list shows how to check that a rented GPU is visible to your code):

➡️
Banana.dev - Offers the ability to deploy your own in-house API via serverless GPUs.
➡️
Baseten - Serverless backend for building ML-powered applications. Also offers GPU access.
➡️
RunPod - Pay-per-second serverless GPU
➡️
Cerebrium - A framework to train, deploy, and monitor machine learning models
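Whichever provider you pick, a useful first step once your instance is running is to confirm that the rented GPU is actually visible to your code. A minimal sketch, assuming PyTorch with CUDA support is installed on the instance:

```python
# Quick sanity check on a cloud GPU instance: can PyTorch see the GPU?
# Assumes: a PyTorch build with CUDA support installed on the instance.
import torch

if torch.cuda.is_available():
    print(f"GPU detected: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("No GPU detected - inference will fall back to the (much slower) CPU.")
```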