35 changes: 32 additions & 3 deletions python-package/README.md
@@ -93,6 +93,26 @@ Currently, querychat uses DuckDB for its SQL engine. It's extremely fast and has

When the querychat UI first appears, you will usually want it to greet the user with some basic instructions. By default, these instructions are auto-generated every time a user arrives; this is slow, wasteful, and unpredictable. Instead, you should create a file called `greeting.md`, and when calling `querychat.init`, pass `greeting=Path("greeting.md").read_text()`.

You can offer suggested prompts to the user by wrapping each one in a `<span class="suggestion"> </span>` tag.

For example:

```markdown
* **Filter and sort the data:**
  * <span class="suggestion">Show only survivors</span>
  * <span class="suggestion">Filter to first class passengers under 30</span>
  * <span class="suggestion">Sort by fare from highest to lowest</span>

* **Answer questions about the data:**
  * <span class="suggestion">What was the survival rate by gender?</span>
  * <span class="suggestion">What's the average age of children who survived?</span>
  * <span class="suggestion">How many passengers were traveling alone?</span>
```

These suggestions appear in the greeting and automatically populate the chat text box when clicked.
This gives the user a few ideas to explore on their own.
You can see this behavior in our [`querychat template`](https://shiny.posit.co/py/templates/querychat/).
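
If you maintain many suggestions, it may be easier to generate the greeting than to write the tags by hand. The following is a minimal sketch, not part of querychat: `suggestion` and `build_greeting` are hypothetical helper names, and the only thing taken from the text above is the `<span class="suggestion">` markup.

```python
def suggestion(text: str) -> str:
    """Wrap one prompt in the suggestion span described above."""
    return f'<span class="suggestion">{text}</span>'


def build_greeting(groups: dict[str, list[str]]) -> str:
    """Render {heading: [prompts]} as a nested Markdown bullet list.

    Hypothetical helper for illustration; querychat only needs the final string.
    """
    lines = []
    for heading, prompts in groups.items():
        lines.append(f"* **{heading}:**")
        lines.extend(f"  * {suggestion(prompt)}" for prompt in prompts)
        lines.append("")  # blank line between groups
    return "\n".join(lines).rstrip() + "\n"


greeting = build_greeting(
    {
        "Filter and sort the data": [
            "Show only survivors",
            "Sort by fare from highest to lowest",
        ],
        "Answer questions about the data": [
            "What was the survival rate by gender?",
        ],
    }
)
print(greeting)
```

You could write the resulting string to `greeting.md` once and keep loading it with `Path("greeting.md").read_text()`, so the greeting stays fixed between sessions.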

If you need help coming up with a greeting, your own app can help you! Just launch it and paste this into the chat interface:

> Help me create a greeting for your future users. Include some example questions. Format your suggested greeting as Markdown, in a code block.
@@ -116,10 +136,14 @@ If you give querychat your dataset and nothing else, it will provide the LLM wit

- Column names
- DuckDB data type (integer, float, boolean, datetime, text)
- For text columns with fewer than 10 unique values, we assume they are categorical variables and include the list of values. This threshold is configurable.
- For integer and float columns, we include the range
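
To make the list above concrete, here is a rough sketch of that kind of per-column summary, written in plain Python over a list of dict rows. It is an illustration only, not querychat's implementation (which works through DuckDB): the function name `summarize_columns`, the parameter name `categorical_threshold`, and the sample rows are all made up for this example.

```python
from numbers import Number


def summarize_columns(rows, categorical_threshold=10):
    """Hypothetical helper: summarize each column of a list of dict rows.

    Text columns with fewer than `categorical_threshold` unique values get
    their values listed; integer and float columns get a (min, max) range.
    """
    summary = {}
    columns = rows[0].keys() if rows else []
    for name in columns:
        values = [row[name] for row in rows if row[name] is not None]
        info = {}
        if values and all(isinstance(v, str) for v in values):
            info["type"] = "text"
            unique = sorted(set(values))
            if len(unique) < categorical_threshold:
                info["values"] = unique  # treated as categorical
        elif values and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        ):
            is_int = all(isinstance(v, int) for v in values)
            info["type"] = "integer" if is_int else "float"
            info["range"] = (min(values), max(values))
        else:
            info["type"] = "other"
        summary[name] = info
    return summary


# Made-up sample rows, loosely shaped like a passenger table.
rows = [
    {"name": "Passenger A", "sex": "female", "age": 29, "fare": 211.34},
    {"name": "Passenger B", "sex": "male", "age": 2, "fare": 151.55},
    {"name": "Passenger C", "sex": "female", "age": 63, "fare": 77.96},
]

# A low threshold here so that `name` (3 unique values) is not listed.
summary = summarize_columns(rows, categorical_threshold=3)
for column, info in summary.items():
    print(column, info)
```

Only a summary like this, never the rows themselves, is what the schema description is built from.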

And that's all the LLM will know about your data.
The actual data is never passed to the LLM: we calculate these values first and pass only the resulting schema information to the LLM.

If the column names are usefully descriptive, it may be able to make a surprising amount of sense out of the data. But if your data frame's columns are `x`, `V1`, `value`, etc., then the model will need to be given more background info--just like a human would.

To provide this information, use the `data_description` argument. For example, if you're using the `titanic` dataset, you might create a `data_description.md` like this:

@@ -210,4 +234,9 @@ For a complete working example, see the [examples/app.py](examples/app.py) file
- Reading greeting and data description from files
- Setting up the querychat configuration
- Creating a Shiny UI with the chat sidebar
- Displaying the filtered data in the main panel

If you have Shiny installed and want to get started right away, you can use our
[querychat template](https://shiny.posit.co/py/templates/querychat/)
or
[sidebot template](https://shiny.posit.co/py/templates/sidebot/).