Update Python README to match R (#11)
> ### Provide a greeting (recommended)
>
> When the querychat UI first appears, you will usually want it to greet the user with some basic instructions. By default, these instructions are auto-generated every time a user arrives; this is slow, wasteful, and unpredictable. Instead, you should create a file called `greeting.md`, and when calling `querychat.init`, pass `greeting=Path("greeting.md").read_text()`.
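The quoted section above can be sketched as a minimal snippet. The greeting text and the `titanic` data frame are placeholders, and the `querychat.init` call is shown commented out since it belongs inside a real Shiny app:

```python
from pathlib import Path

# In a real app, greeting.md would be written by hand and live next to
# app.py; we create it here only so the example is self-contained.
Path("greeting.md").write_text(
    "Hello! I can help you filter and explore this dataset.\n"
)

# Read the greeting once at startup instead of generating it per user.
greeting = Path("greeting.md").read_text()

# Then pass it to querychat.init, e.g.:
# querychat_config = querychat.init(
#     df=titanic,
#     table_name="titanic",
#     greeting=greeting,
# )
print(greeting)
```

Reading the file once at startup avoids the per-user generation cost the README warns about.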
You can provide suggestions to the user by using the <span class="suggestion"> </span> tag. For example:

* **Filter and sort the data:**
  * <span class="suggestion">Show only survivors</span>
  * <span class="suggestion">Filter to first class passengers under 30</span>
  * <span class="suggestion">Sort by fare from highest to lowest</span>
* **Answer questions about the data:**
  * <span class="suggestion">What was the survival rate by gender?</span>
  * <span class="suggestion">What's the average age of children who survived?</span>
  * <span class="suggestion">How many passengers were traveling alone?</span>

This provides a clickable message in the greeting that auto-populates the chat text box, giving the user a few ideas to explore on their own. You can see this behavior in our querychat template.
The <span class="suggestion"> tag is a good example to point out for the greeting; it makes the greeting flow better, and we should showcase how to use it. It really polishes up the application and chat.
This was a suggested block of text to put in this part of the README.
> - Column names
> - DuckDB data type (integer, float, boolean, datetime, text)
> - For text columns with less than 10 unique values, we assume they are categorical variables and include the list of values
Suggested change:

> - For text columns with less than 10 unique values, we assume they are categorical variables and include the list of values. This threshold is configurable.

> - For integer and float columns, we include the range
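The schema summary described in the bullets above can be sketched in a few lines of pandas. This is an illustration of the idea, not querychat's actual implementation: `describe_schema` and the output format are made up for the example, and the 10-value threshold is a parameter here since the comment notes it should be configurable.

```python
import pandas as pd

def describe_schema(df: pd.DataFrame, categorical_threshold: int = 10) -> str:
    """Summarize a data frame: column names, types, categorical values
    for low-cardinality text columns, and ranges for numeric columns."""
    lines = []
    for col in df.columns:
        s = df[col]
        if pd.api.types.is_numeric_dtype(s):
            # Numeric columns: include the observed range.
            lines.append(f"- {col} ({s.dtype}): range {s.min()} to {s.max()}")
        elif s.nunique() < categorical_threshold:
            # Low-cardinality text columns: treat as categorical.
            values = ", ".join(map(str, s.dropna().unique()))
            lines.append(f"- {col} ({s.dtype}): categorical, values: {values}")
        else:
            lines.append(f"- {col} ({s.dtype})")
    return "\n".join(lines)

df = pd.DataFrame({
    "age": [22, 38, 26],
    "sex": ["male", "female", "female"],
})
print(describe_schema(df))
```

Only this textual summary needs to reach the model; the rows themselves stay local.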
> And that's all the LLM will know about your data. If the column names are usefully descriptive, it may be able to make a surprising amount of sense out of the data. But if your data frame's columns are `x`, `V1`, `value`, etc., then the model will need to be given more background info--just like a human would.
It might be important here to clarify that the data itself does not go to the LLM. Suggested wording:

> And that's all the LLM will know about your data. The actual data does not get passed to the LLM; we calculate these values before we pass the schema information to the LLM.
> ### Use a different LLM provider
>
> By default, querychat uses GPT-4o via the OpenAI API. If you want to use a different model, you can provide a `create_chat_callback` function that takes a `system_prompt` parameter and returns a chatlas `Chat` object. (The parameter must be called `system_prompt`, as it is passed by keyword.)
> ```python
> import chatlas
> from functools import partial
>
> # Option 1: Define a function
> def my_chat_func(system_prompt: str) -> chatlas.Chat:
>     return chatlas.ChatAnthropic(
>         model="claude-3-7-sonnet-latest",
>         system_prompt=system_prompt,
>     )
>
> # Option 2: Use partial
> my_chat_func = partial(chatlas.ChatAnthropic, model="claude-3-7-sonnet-latest")
>
> querychat_config = querychat.init(
>     df=titanic,
>     table_name="titanic",
>     create_chat_callback=my_chat_func,
> )
> ```
It's really good that you have these examples. Shiny prides itself on users not having to write callbacks, so seeing that term can be a bit scary. I suspect most people will use Option 1.
> - Reading greeting and data description from files
> - Setting up the querychat configuration
> - Creating a Shiny UI with the chat sidebar
> - Displaying the filtered data in the main panel
Add at the end:

> If you have Shiny installed and want to get started right away, you can use our querychat template.
Merge #12 into this before merging into |
sync up R Readme
add a few more clarifications in sections
* main: docs: Update Python and R READMEs (#11)