There is little value in paying to host a generic ChatGPT instance for your brand, since it would be the same as all the other ChatGPT demos already on the internet and offer no competitive advantage. Value can only come from using it to support your business processes in a way that either lowers your operational cost or drives more sales, for example by supplementing or partially replacing expensive support services like customer service with an automated service that scales at minimal cost. This is the field where chatbots are currently state of the art, and we will explore whether and how we can leverage ChatGPT to lower operational cost or increase revenue.
Let’s tackle the simplest solution first. Given how powerful ChatGPT is, why not just replace the chatbot with a specialized ChatGPT instance? This is a bad idea, since ChatGPT's responses can be unpredictable or simply incorrect, and in ChatGPT’s standard implementation it is impossible to control the responses completely. Considering that this chatbot would be representing your brand, having it produce an incorrect or offensive response could do serious reputational damage.
What about using ChatGPT to generate synthetic user input as extra training data? Acquiring actual user data can be time-consuming and expensive, and may not cover many of the edge cases. Especially when building a chatbot from scratch, this is often a blocking problem. In theory, ChatGPT could produce mountains of training data by generating plausible variations of a customer's input sentence.
However, this introduces other issues, as Rasa explains well in their article. Augmenting NLU training data this way does not yield significant improvements to the model: there is a trade-off between faithfulness and variability. Either we generate data with low variability, which deviates only slightly from the source sentence and thus brings no new information to the model, or we generate data with high variability, which deviates too much from the source sentence, is no longer related to it, and cannot be used to train the bot. Generating training data might not be the best use case for now, but a chatbot requires other NLP tasks as well. Simply put, given a user sentence in natural language, a chatbot tries to do the following two things:
1. Understand the user's intent (NLU – Natural Language Understanding)
2. Generate the answer (NLG – Natural Language Generation)
The better a chatbot is at these two tasks, the more services it can provide without human agent intervention, thus lowering the workload of your call centers. Any improvement, even a small one, automatically scales to all conversations the chatbot handles, so the savings grow with the number of customer interactions while the operational cost stays low.
The underlying model of ChatGPT (the GPT-3 model) is one of the largest and most powerful language models available today, with 175 billion parameters. The model is one of the best at making a good guess as to what a human would say next, given what was already said. So, for example, given that a story starts with “Once upon a time there was a white rabbit”, what could the following sentence be? Or, more specifically for chatbots: given the sentence “What does your product do?”, the model would generate a sentence or even a paragraph of what it assumes a human would say.
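As a minimal sketch of this next-word prediction, here is how one might ask a GPT-3 completion model to continue the rabbit sentence. This assumes the pre-1.0 `openai` Python package and the `text-davinci-003` model; the exact model behind ChatGPT is not exposed this way, so treat both as illustrative choices:

```python
import openai  # legacy (pre-1.0) openai package

openai.api_key = "YOUR_API_KEY"

# Ask a GPT-3 completion model to guess what a human would write next.
response = openai.Completion.create(
    model="text-davinci-003",  # a GPT-3 family model; choice is an assumption
    prompt="Once upon a time there was a white rabbit",
    max_tokens=40,
    temperature=0.7,
)

print(response["choices"][0]["text"])  # the model's continuation
```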
The downside of this broad perspective is a lack of understanding of narrow, domain-specific questions, which are characteristic of enterprise chatbots.
Given the costly nature of training a state-of-the-art large language model, which requires some 45 terabytes of text, it is likely that we will see a future where chatbots are simply built on top of a pre-trained model and then fine-tuned to fit specific purposes, brands or products. “Fine-tuning” refers to refining a model like ChatGPT to better recognize specific patterns of input and output, for example sentences and words related to your company’s field. It is possible to fine-tune the large model with our own custom dataset so that the NLU and NLG fit our particular use cases. We still leverage the advantages of a large model, but with the NLU and NLG constrained to our domain. Potential use cases will be discussed in a later paragraph.
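As an illustration of what such fine-tuning can look like in practice, here is a minimal sketch using OpenAI's legacy fine-tuning endpoint, which accepts JSONL files of prompt/completion pairs. Note that this endpoint fine-tunes GPT-3 base models such as `davinci` rather than ChatGPT itself, and the dataset file name is hypothetical:

```python
import openai

openai.api_key = "YOUR_API_KEY"

# Each line of the JSONL file holds one prompt/completion pair, e.g.
# {"prompt": "What does your product do?\n\n###\n\n",
#  "completion": " Our product ... END"}
training_file = openai.File.create(
    file=open("company_dialogues.jsonl", "rb"),  # hypothetical dataset
    purpose="fine-tune",
)

# Start a fine-tuning job on a GPT-3 base model.
job = openai.FineTune.create(
    training_file=training_file["id"],
    model="davinci",
)
print(job["id"])  # poll this job id until the fine-tuned model is ready
```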
This would allow a company to use a state-of-the-art chatbot with the brand values and tone unique to that company, without incurring the cost of creating the model from scratch.
To fine-tune the ChatGPT model specifically, you would have to agree to let OpenAI potentially use your data to improve their services. Data privacy laws in most countries would require highly restrictive contracts with third parties before customer PII data could be shared.
With today’s anonymization tools, and legislation that limits the liability of providing this type of data to such services, it is in most cases possible to use these services legally.
The alternative is on-premise fine-tuning, but then we are limited to either open-source models (e.g. BLOOM) or the few smaller companies willing to let you host their model.
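A rough sketch of what on-premise fine-tuning of an open-source model could look like, using the Hugging Face `transformers` library and a small BLOOM checkpoint. The dataset file and hyperparameters are assumptions, and a real run would need GPU-class hardware and careful tuning:

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

model_name = "bigscience/bloom-560m"  # small BLOOM variant for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical JSONL file with one support dialogue per line in a "text" field.
dataset = load_dataset("json", data_files="support_dialogues.jsonl")["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bloom-support", num_train_epochs=1),
    train_dataset=dataset,
    # Causal language modeling: the collator derives labels from input ids.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```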
Large language models (LLMs) will eventually revolutionize the solutions we currently use chatbots for. Here we'll cover how to know when to reevaluate this technology. When solutions emerge in which a ChatGPT-like system only interprets the customer's need while the actual response is handled by a stricter system such as a chatbot, it will be time to revisit how this technology can be used in your organization.
Currently, there is no proven way to provide ChatGPT with enough context to make its output relevant and ensure it has the tone of voice specific to your brand or company. Solutions will be developed that provide LLMs with company-specific context, thereby enabling the generation of relevant training data for chatbots; at that point it will be time to reevaluate this approach. This innovation might take the form of a work instruction on how to explain company-specific information, such as the type of products and the unique differentiation from competitors, to a ChatGPT-like solution. That way the model would produce training data specifically for your company.
Another development to look out for is simplified integration with enterprise systems. Some use cases might already be technically possible today but are not easily implemented, given the lack of integration options LLMs offer. Microsoft has recently announced that it plans to provide simplified integration options in the near future.
Current enterprise natural language understanding technology struggles with long texts, context, and human emotions and nuances; noticing a sense of urgency in a user query, for example, is beyond today's chatbots. LLMs like ChatGPT are at the cutting edge of efforts to solve these problems.
The development of ChatGPT's NLG capabilities offers great potential. Unfortunately, as mentioned before, only once LLMs like ChatGPT can generate useful training data will they tremendously reduce the time and cost of implementing conversational AI from scratch.
Looking at the crossroads of simplified integrations and NLU developments, this might enable a setup where most user queries are handled by the current chatbot, while the often long, hard-to-classify queries are sent to an LLM, which returns them with the appropriate classification.
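A sketch of what such a hybrid setup could look like: the existing chatbot NLU handles the easy queries, and anything long or low-confidence is forwarded to an LLM for classification. The `nlu.classify` interface, the intent list, the thresholds and the model name are all assumptions:

```python
import openai

INTENTS = ["billing", "cancellation", "technical_support", "other"]

def classify_with_llm(utterance: str) -> str:
    """Ask the LLM to pick an intent for a hard-to-classify query."""
    prompt = (
        "Classify the customer message into exactly one of these intents: "
        + ", ".join(INTENTS)
        + f".\n\nMessage: {utterance}\nIntent:"
    )
    response = openai.Completion.create(
        model="text-davinci-003",  # assumption
        prompt=prompt,
        max_tokens=5,
        temperature=0,  # deterministic output for classification
    )
    return response["choices"][0]["text"].strip()

def route(utterance: str, nlu) -> str:
    intent, confidence = nlu.classify(utterance)  # hypothetical chatbot NLU
    # Long or low-confidence queries go to the LLM instead.
    if confidence < 0.6 or len(utterance.split()) > 50:
        intent = classify_with_llm(utterance)
    return intent
```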
Focusing on NLG, this could enable a future where ChatGPT is used to interpret figures and numbers and generate concise reports in natural language.
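For example, a prompt like the following could turn raw figures into a short prose summary (the metrics are made-up example figures):

```python
import openai

metrics = {"tickets_received": 412, "tickets_resolved": 390,
           "avg_handle_time_minutes": 7.2}  # made-up example figures

prompt = (
    "Write a two-sentence plain-language summary of these weekly "
    "support metrics:\n"
    + "\n".join(f"{k}: {v}" for k, v in metrics.items())
)

response = openai.Completion.create(
    model="text-davinci-003", prompt=prompt, max_tokens=80  # assumption
)
print(response["choices"][0]["text"].strip())
```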
ChatGPT and other LLMs do offer useful applications today, even if tasks like generating proper training data are a bridge too far for now. The current level of these models' NLG capabilities already lets us use them to translate an input sentence: not simply from one spoken language to another, but also from a description of a function in English to actual working Python code, or from a complex legal text to plain language.
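These translation-style tasks need nothing more than a well-phrased prompt. A sketch, again assuming the legacy completion API and a GPT-3 model; the clause placeholder would be replaced with real text:

```python
import openai

# English description -> Python code
code_prompt = (
    "Write a Python function that takes a list of order totals "
    "and returns the average, ignoring totals of zero."
)

# Complex legal text -> plain language (clause text is a placeholder)
plain_prompt = (
    "Rewrite the following contract clause in simple, everyday language:\n"
    "<paste clause here>"
)

for prompt in (code_prompt, plain_prompt):
    response = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=150  # assumption
    )
    print(response["choices"][0]["text"].strip())
```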
Moreover, while the prompts generated by the model cannot be seen as interaction with a real user, the model can be used for certain tasks in a similar fashion. We can use ChatGPT to act as a user in order to evaluate experiments. Given how cheap it is, we can easily scale and run various experiments on, for example, how different dialogues and hyperparameters in our bot influence overall performance. In that way we can optimize the classifications and routes, ensuring that a customer can find the correct information or perform the correct action while limiting the need for a human agent handover. Another use case is to apply this same user-simulation approach as an additional step in our quality assurance process, finding mistakes by letting the bot converse with a simulated ‘user’ before release.
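A minimal sketch of such a simulated-user loop. The `Bot` class is a hypothetical stand-in for your real chatbot interface, and the persona, model name and turn count are assumptions:

```python
import openai

PERSONA = ("You are playing a customer who wants to cancel a subscription "
           "but does not know where to find the cancellation page.")

class Bot:  # hypothetical stand-in for your real chatbot interface
    def respond(self, message: str) -> str:
        return "..."  # call your chatbot here

def simulated_user_turn(history: str) -> str:
    """Generate the simulated customer's next message."""
    prompt = f"{PERSONA}\n\nConversation so far:\n{history}\nCustomer:"
    response = openai.Completion.create(
        model="text-davinci-003",  # assumption
        prompt=prompt,
        max_tokens=60,
        temperature=0.9,  # some variety across simulated conversations
        stop=["\n"],
    )
    return response["choices"][0]["text"].strip()

bot = Bot()
history = ""
for _ in range(5):  # run a short simulated conversation
    user_message = simulated_user_turn(history)
    bot_message = bot.respond(user_message)
    history += f"Customer: {user_message}\nBot: {bot_message}\n"
print(history)  # inspect or score the transcript afterwards
```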
An enterprise chatbot often has a major FAQ component, where the question is always answered with static information that can be extracted from the website. By providing this information as documents, ChatGPT can be fine-tuned on it and thus serve as a company-specific FAQ bot. These models could also speed up the writing of answers to new FAQ entries by providing a first draft that experts then refine and validate.
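A sketch of how FAQ content extracted from a website could be converted into the prompt/completion pairs that the fine-tuning endpoint in the earlier sketch expects. The separator and stop-token conventions follow OpenAI's legacy fine-tuning guidance, and the FAQ entries are placeholders:

```python
import json

faq_entries = [  # placeholders for content extracted from the website
    {"question": "What does your product do?",
     "answer": "It automates invoice processing for small businesses."},
    {"question": "How do I reset my password?",
     "answer": "Use the 'Forgot password' link on the login page."},
]

with open("faq.jsonl", "w") as f:
    for entry in faq_entries:
        f.write(json.dumps({
            # The separator marks where the prompt ends ...
            "prompt": entry["question"] + "\n\n###\n\n",
            # ... and the stop token marks where the completion ends.
            "completion": " " + entry["answer"] + " END",
        }) + "\n")
```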