How to Format the Output of LLM Using Langchain on Jetson


In modern households, smart homes are becoming increasingly common. By connecting smart devices, automation, and artificial intelligence, you can turn your home into a smarter and more convenient living space. To this end, we plan to integrate an LLM into Home Assistant to create a more intelligent home assistant. The first step toward this goal is to enable the LLM to output control signals that can manage smart home devices.

In this wiki, you will learn how to use Langchain to format the output of a large language model and deploy it on an edge computing device. Specifically, we use the language understanding capability of a large language model to build a local chatbot, and then use Langchain tools to constrain the model's output to a fixed format.

What is an LLM?

Large Language Models (LLMs), such as GPT, are deep-learning-based artificial intelligence models that excel at natural language processing tasks. They can understand and generate text, and are therefore widely used in applications such as text generation, translation, and question answering.

Why is it necessary to format the output of an LLM?

Formatting the output of a Large Language Model (LLM) makes it easier to consume and tailors it to a specific application. Text generated by an LLM often includes redundant information, unnecessary details, or inconsistently formatted content. Constraining the output to a fixed format lets you strip the unwanted parts and check that the text meets the requirements of your application. This is essential when a program, rather than a person, has to act on the response.
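
To make the problem concrete, here is a minimal sketch using only the Python standard library. The text in `raw_reply` is invented for illustration (it is not output captured from a real model), and the helper name `extract_json` is an assumption of this example. It shows how an application might discard the conversational filler and keep only the machine-readable part of a reply.

```python
import json
import re


def extract_json(text):
    """Parse the span from the first '{' to the last '}' as JSON, or return None."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None


# Hypothetical reply: chatty text wrapped around the part a program needs.
raw_reply = (
    "Sure! It sounds warm in there.\n"
    '{"suggestion": "I will turn on the air conditioner.", "control": "<1>"}\n'
    "Let me know if you need anything else."
)

parsed = extract_json(raw_reply)
print(parsed["control"])
```

Extraction like this is fragile on its own, which is why the rest of this wiki asks the model to follow an explicit output format instead of guessing at it afterwards.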

How to format the output of an LLM?

Here we use Langchain, a framework designed to help developers build end-to-end applications with language models. It offers a set of tools, components, and interfaces that simplify the process of creating applications backed by large language models and chat models.
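
The general pattern behind structured output is: describe the fields you expect, turn that description into instructions appended to the prompt, and validate the reply against the same description. The sketch below imitates that pattern with the standard library only. It is not Langchain's implementation, and `SCHEMA`, `build_format_instructions`, and `parse_reply` are hypothetical names chosen for this example.

```python
import json

# Field name -> description of what the model should put there.
SCHEMA = {
    "suggestion": "a normal conversational reply",
    "control": "<1> to turn the air conditioner on, <0> to turn it off",
}


def build_format_instructions(schema):
    """Describe the expected JSON object so it can be appended to a prompt."""
    lines = ["Reply with a single JSON object containing exactly these keys:"]
    for name, description in schema.items():
        lines.append(f'  "{name}": string  // {description}')
    return "\n".join(lines)


def parse_reply(reply, schema):
    """Parse the reply as JSON and fail loudly if an expected key is missing."""
    data = json.loads(reply)
    missing = [name for name in schema if name not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


instructions = build_format_instructions(SCHEMA)
prompt = f"You are a smart speaker.\n{instructions}\nUser: It is too hot."

# Hypothetical model reply that follows the instructions.
reply = '{"suggestion": "Turning it on.", "control": "<1>"}'
result = parse_reply(reply, SCHEMA)
print(result["control"])
```

Keeping the instructions and the validation tied to one schema means the prompt and the parser cannot drift apart when a field is added or renamed.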

How to deploy on edge devices?

Step 1. Prepare a Jetson device (reComputer J4012) running JetPack 5.0 or later.

Step 2. Open a terminal and install the required dependencies. The LlamaCpp class used in Step 3 also requires the llama-cpp-python package to be installed.

pip3 install --no-cache-dir --verbose langchain[llm] openai
pip3 install --no-cache-dir --verbose gradio==3.38.0

Step 3. Create a new Python script and insert the following code into it.

import os

import gradio as gr
from langchain.llms import LlamaCpp, OpenAI
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate

# Only used when no local model path is given and the OpenAI backend is selected.
os.environ["OPENAI_API_KEY"] = "your openai api key"


class ChatBot:
    def __init__(self, llama_model_path=None, history_length=3):
        self.chat_history = []
        self.history_threshold = history_length
        self.llm = None
        if llama_model_path is not None:
            # Load a local GGUF model through llama-cpp-python;
            # sampling options are left at their defaults.
            self.llm = LlamaCpp(model_path=llama_model_path)
        else:
            # Note: OpenAI has retired text-davinci-003; substitute a
            # currently available completion model.
            self.llm = OpenAI(model_name="text-davinci-003")

        # Fields the model must return.
        response_schemas = [
            ResponseSchema(name="user_input", description="This is the user's input"),
            ResponseSchema(name="suggestion", type="string", description="your suggestion"),
            ResponseSchema(name="control", description="This is your response"),
            ResponseSchema(name="temperature", type="int", description="This is used to indicate the degrees "
                                                                       "centigrade temperature of the air conditioner."),
        ]
        self.output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
        self.format_instructions = self.output_parser.get_format_instructions()

        self.template = """
            Now you are a smart speaker, and you need to determine whether to turn on the air conditioner based on the
            user's input.
            In the suggestion section, please reply normal conversation.
            In the control section, if you need to turn on the air conditioner, please reply with <1>; if you need to
            turn off the air conditioner, please reply with <0>.

            {format_instructions}

            Please do not generate any comments.

            User input: {user_input}
        """
        self.prompt = PromptTemplate(
            input_variables=["user_input"],
            partial_variables={"format_instructions": self.format_instructions},
            template=self.template,
        )

    def format_chat_prompt(self, message):
        # Prepend the stored turns so the model sees recent context.
        prompt = ""
        for turn in self.chat_history:
            user_message, bot_message = turn
            prompt = f"{prompt}\nUser: {user_message}\nAssistant: {bot_message}"
        prompt = f"{prompt}\nUser: {message}\nAssistant:"
        return prompt

    def respond(self, message):
        prompt = self.prompt.format(user_input=message)
        formatted_prompt = self.format_chat_prompt(prompt)
        bot_message = self.llm(formatted_prompt)
        # Uncomment to turn the raw reply into a Python dict:
        # self.output_parser.parse(bot_message)

        # Keep only the most recent turns.
        if len(self.chat_history) >= self.history_threshold:
            del self.chat_history[0]
        self.chat_history.append((message, bot_message))
        return "", self.chat_history

    def run_webui(self):
        with gr.Blocks() as demo:
            gr.Markdown("# This is a demo for format output of LLM")
            chatbot = gr.Chatbot(height=500)
            msg = gr.Textbox(label="Prompt")
            btn = gr.Button("Submit")
            clear = gr.ClearButton(components=[msg, chatbot], value="Clear console")

            # Both the Submit button and the Enter key send the message.
            btn.click(self.respond, inputs=[msg], outputs=[msg, chatbot])
            msg.submit(self.respond, inputs=[msg], outputs=[msg, chatbot])

        demo.launch()


if __name__ == '__main__':
    chatbot_ins = ChatBot("/home/nvidia/Mirror/llama-2-7b-chat.Q4_0.gguf")
    chatbot_ins.run_webui()
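
The history handling in `respond` keeps only the most recent `history_length` turns, and `format_chat_prompt` flattens them into a single prompt. A minimal standalone sketch of the same sliding-window idea follows. It uses `collections.deque` with `maxlen` in place of the manual `del`, which is a swapped-in technique and not what the script itself does, and the questions and answers are placeholders.

```python
from collections import deque

# A deque with maxlen drops the oldest turn automatically once it is full.
history = deque(maxlen=3)

for i in range(1, 6):
    history.append((f"question {i}", f"answer {i}"))


def format_chat_prompt(history, message):
    """Flatten past turns plus the new message into one prompt string."""
    lines = []
    for user_message, bot_message in history:
        lines.append(f"User: {user_message}")
        lines.append(f"Assistant: {bot_message}")
    lines.append(f"User: {message}")
    lines.append("Assistant:")
    return "\n".join(lines)


prompt = format_chat_prompt(history, "question 6")
print(prompt)
```

Bounding the history matters on an edge device because every stored turn is re-sent to the model and consumes part of its context window.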

Step 4. Run the script with python3 in the terminal, then open the address that Gradio prints in a browser to access the WebUI and test the results.
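
Once replies follow the schema from Step 3, the `control` and `temperature` fields still have to be converted into values a program can act on. The sketch below shows one way to do that defensively. The function name `to_command` and the 16 to 30 degree clamp range are assumptions made for this example, not values from the script.

```python
def to_command(parsed, min_temp=16, max_temp=30):
    """Convert a parsed reply dict into a (power_on, temperature) tuple.

    min_temp and max_temp are illustrative limits, not values from the wiki.
    """
    control = str(parsed.get("control", "")).strip()
    if control in ("<1>", "1"):
        power_on = True
    elif control in ("<0>", "0"):
        power_on = False
    else:
        raise ValueError(f"unexpected control value: {control!r}")

    temperature = None
    raw_temp = parsed.get("temperature")
    if raw_temp is not None:
        try:
            # Clamp to a safe range in case the model returns an extreme value.
            temperature = max(min_temp, min(max_temp, int(raw_temp)))
        except (TypeError, ValueError):
            temperature = None  # unusable value: leave the set point unchanged
    return power_on, temperature


print(to_command({"control": "<1>", "temperature": "24"}))
```

Rejecting unknown `control` values instead of guessing keeps a malformed reply from switching a device by accident.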

Next Steps

  • Integrate with Nvidia Riva to replace text input with voice interaction.
  • Connect to Home Assistant to control smart home devices using the output from the LLM.
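
As a rough illustration of the second item, the sketch below maps a power flag and target temperature to request payloads. It assumes Home Assistant's REST convention of `POST /api/services/<domain>/<service>` and its `climate` services, which you should verify against your own installation. The entity ID `climate.living_room` and the function name `build_service_calls` are hypothetical. The code only builds the payloads and does not send any request.

```python
def build_service_calls(power_on, temperature=None, entity_id="climate.living_room"):
    """Return a list of (endpoint, payload) pairs for the assumed REST API.

    entity_id is a hypothetical placeholder; replace it with a real entity.
    """
    service = "turn_on" if power_on else "turn_off"
    calls = [(f"/api/services/climate/{service}", {"entity_id": entity_id})]
    if power_on and temperature is not None:
        # Only set a target temperature when the unit is being switched on.
        calls.append((
            "/api/services/climate/set_temperature",
            {"entity_id": entity_id, "temperature": temperature},
        ))
    return calls


for endpoint, payload in build_service_calls(True, 24):
    print("POST", endpoint, payload)
```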
