How to make LLMs smarter

Today, our company's HR team gave a talk on "Making Better Decisions Through Cognitive Psychology," which overlaps in many ways with the "Business Analysis and Decision Making" course I took with Professor Wu at NUS.

When I previously researched how to make LLMs smarter, I also discovered similar approaches.

Large language models are probabilistic models. If you ask one to answer a question quickly, it behaves like System 1 (fast thinking).

There are multiple ways to make existing large language models (LLMs) smarter, which can be roughly divided into four categories:

1. Ask better questions

The training process of LLMs uses a vast amount of data, far exceeding the books we read and the content we find online.

Therefore, asking good questions helps better leverage the potential of LLMs. Many books and websites provide guidance on how to construct prompts, and the basic logic is as follows:

| Element | Name | Description |
| --- | --- | --- |
| Task | Task | The type of content that GPT needs to generate |
| Instruction | Instructions | Principles to follow when generating content |
| Character | Role | The role that GPT needs to play |
| Keywords | Seed-word | Points to emphasize |

For example:

Task: Write a WeChat official account article introducing how to make LLMs smarter.

Role: Senior WeChat Official Account operator

Seed-words: "System One, System Two"

Using Few-Shot combined with Chain-of-Thought (CoT)

Directly asking an LLM "What is 1364 multiplied by 2343?" may produce a wrong answer.

However, we can improve the result by modifying the prompt: first show the model a worked example that breaks a multiplication into partial products and then adds up the results, and then ask: "Now, according to the above method, calculate what 1364 multiplied by 2343 equals."

This time, the answer is correct (1364 × 2343 = 3,195,852).
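Here is a minimal sketch of what such a Few-Shot plus Chain-of-Thought prompt could look like in code. The worked example (1512 × 3127) and the `ask_llm` placeholder are illustrative assumptions, not the exact prompt used above:

```python
# Minimal sketch of a Few-Shot + Chain-of-Thought prompt for multiplication.
# The worked example below is illustrative; ask_llm is a hypothetical client call.

few_shot_cot_prompt = """\
Q: What is 1512 multiplied by 3127?
A: Break 3127 into 3000 + 100 + 20 + 7.
   1512 x 3000 = 4,536,000
   1512 x 100  = 151,200
   1512 x 20   = 30,240
   1512 x 7    = 10,584
   Then add up the results: 4,536,000 + 151,200 + 30,240 + 10,584 = 4,728,024.

Q: Now, according to the above method, calculate what 1364 multiplied by 2343 equals.
A:"""

# answer = ask_llm(few_shot_cot_prompt)
# Expected: the model decomposes the problem the same way and returns 3,195,852.
```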

This process is similar to a decision-making flow: relying only on one's first thoughts may not be enough, but following a structured, step-by-step decision process helps, much as an LLM improves its answers when Few-Shot examples demonstrate a Chain-of-Thought.


2. Fine-Tuning

I once heard a podcast mention that if fine-tuning strengthens a model's imitation ability in one area, its abilities in other areas may weaken. Still, this method can make the LLM smarter in the direction we want.

It is important to distinguish Fine-Tuning from the Few-Shot prompting mentioned earlier: Fine-Tuning updates the model's weights and changes the model itself, whereas Few-Shot only provides examples in the prompt and does not alter the model.
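As a rough sketch of what fine-tuning looks like in practice, the snippet below prepares a few example conversations and submits a fine-tuning job. It assumes the OpenAI Python SDK; the file name, model name, and example data are all illustrative:

```python
# Rough sketch of a fine-tuning workflow (assumes the OpenAI Python SDK;
# file name, model name, and example data are illustrative).
import json
from openai import OpenAI

examples = [
    {"messages": [
        {"role": "user", "content": "How can I make an LLM smarter?"},
        {"role": "assistant", "content": "Ask better questions, fine-tune, use RAG, or call tools."},
    ]},
    # ... a real job needs many more examples than this
]

# 1. Write training examples in JSONL format.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

client = OpenAI()
# 2. Upload the file and start a fine-tuning job.
training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id, model="gpt-3.5-turbo")
print(job.id)  # the resulting fine-tuned model has new weights, unlike Few-Shot prompting
```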

This is similar to adopting a decision-making model: you can guide and train your brain by choosing a direction that suits you. Once you are used to making rational decisions, it may become harder to make emotional ones, so it is important to choose the method that best fits the specific situation.


3. RAG (Retrieval Augmented Generation)

RAG (Retrieval Augmented Generation) is a technique in which a large language model (LLM) retrieves relevant information from a collection of documents before answering a question or generating text, and then bases its answer on what was retrieved. This improves the quality of the response instead of relying solely on the capabilities of the LLM itself. RAG lets developers attach a relevant knowledge base without retraining the entire model for each specific task, providing additional information to the model and thus improving the accuracy of its answers. The workflow of RAG is shown in the figure below.
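To make the workflow concrete, here is a minimal, self-contained sketch of the RAG pattern. The toy word-overlap scoring stands in for a real embedding and vector-search step, and `ask_llm` is again just a placeholder for an actual LLM call:

```python
# Minimal RAG sketch: retrieve the most relevant documents, then answer with them
# in the prompt. Word-overlap scoring is a toy stand-in for real embeddings.

documents = [
    "Fine-tuning updates the weights of a language model using new training data.",
    "Retrieval Augmented Generation retrieves relevant documents before generating an answer.",
    "Chain-of-Thought prompting asks the model to reason step by step.",
]

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Score each document by word overlap with the query and return the top_k."""
    query_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(query_words & set(d.lower().split())), reverse=True)
    return scored[:top_k]

def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Stuff the retrieved context into the prompt before the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer the question using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

prompt = build_rag_prompt("What is Retrieval Augmented Generation?", documents)
print(prompt)  # answer = ask_llm(prompt)
```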

In my view, Claude2 does a pretty good job with RAG.

Some others that perform well include Perplexity, txyz, ChatGPT, chatpdf, etc.

This process is somewhat similar to how we make decisions: when we know we are not strong in a certain area, we can consult professionals in that field and draw on their expertise to make wiser decisions. For example, in the Thai "Sleeping Beauty" cave rescue at Tham Luang, information from experts in different fields was retrieved and combined, much like a RAG pipeline.


4. Function Calling and Tool Use

The most recent GPT launch announced the Function Calling feature, which greatly expanded the boundaries of what LLMs can do.

The latest version of GPT has become smarter mainly by incorporating the earlier Code Interpreter as an analysis capability, allowing it to call Python when handling complex problems.
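Here is a minimal sketch of how function calling can look with the OpenAI Chat Completions API. The `multiply` tool, the model name, and the assumption that the model chooses to call the tool are all illustrative:

```python
# Sketch of function calling: describe a tool to the model, let it decide to call it,
# then execute the call locally. Tool name and model name are illustrative.
import json
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two integers exactly.",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 1364 multiplied by 2343?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model decided to call our tool
    args = json.loads(message.tool_calls[0].function.arguments)
    print(args["a"] * args["b"])  # Python does the exact arithmetic, not the LLM
```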

Before this feature appeared, the earliest related paper I saw was "PAL: Program-Aided Language Models", in which a large language model is used to understand the question and Python is used to perform the calculation.
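The PAL idea can be sketched as follows: the model is asked to produce a small Python program instead of a final number, and the program is executed locally. The "model output" below is hard-coded purely for illustration:

```python
# PAL-style sketch: the LLM writes a small program, and Python runs it.
# The "model output" string is hard-coded here purely for illustration.

model_generated_code = """
a = 1364
b = 2343
answer = a * b
"""

namespace: dict = {}
exec(model_generated_code, namespace)   # in practice, sandbox untrusted generated code
print(namespace["answer"])              # 3195852, computed by Python, not the LLM
```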

PAL was later also integrated into the open-source framework Langchain.

Langchain itself also developed the Math chain feature.

Of course, OpenAI also provides various plugins, such as a calculator.

This process of calling functions and using tools mirrors how we make decisions: we may be aware that our own abilities are limited, but we can use various tools to assist us. By using these tools, we can solve problems more effectively and make better decisions.


Final Remarks

Just as we seek information, consult experts, or use various tools when making decisions in daily life, the development of large language models (LLMs) reflects the same diversified, comprehensive way of thinking. Whether by asking better questions, fine-tuning models, using retrieval-augmented generation (RAG), or learning to call functions and use tools effectively, these strategies all emphasize flexibly drawing on resources and technology when facing complex problems. I hope that colleagues in our company can learn to use LLMs better, making LLMs smarter; and also, like LLMs, become better at drawing on information, experts, and tools to make smarter decisions themselves.