Commit fef017f

Update GPT integration for image recognition.md (#418)
1 parent faf0051 commit fef017f

1 file changed

Lines changed: 74 additions & 22 deletions

docs/5. Integrations/GPT integration for image recognition.md

@@ -1,32 +1,84 @@
1-
# GPT integration for image recognition
2-
Recognize the contents in an image being sent by your users. Some example usecases can be to grade worksheets, digitse the inputs written on pages on the go, perform analysis of condition of skin, water quality, wounds, environment etc.
31

4-
## How to use in a flow
2+
<h3>
3+
<table>
4+
<tr>
5+
<td><b>5 minutes read</b></td>
6+
<td style={{ paddingLeft: '40px' }}><b>Level: Advanced</b></td>
7+
<td style={{ paddingLeft: '40px' }}><b>Last Updated: October 2025</b></td>
8+
</tr>
9+
</table>
10+
</h3>
511

6-
1. Get the image input from the user
7-
2. Go to call a webhoook node
8-
3. In the type of webhook call select `function`
9-
4. Give the webhoook an appropriate result name. In the exmaple shown the webhook result name is `gptvision`
10-
5. Enter `parse_via_gpt_vision`
1112

12-
<img width="527" alt="Screenshot 2024-05-07 at 6 20 20 PM" src="https://github.com/glific/docs/assets/141305477/0741e6a0-1e8e-4b30-a246-e6ff454d90c6"/>
13+
# GPT Integration for Image Recognition
14+
15+
This feature recognizes and interprets the contents of images shared by end users. It is useful for a variety of use cases, such as:
16+
17+
- Grading worksheets or assignments
18+
- Reading and digitizing handwritten notes
19+
- Checking skin conditions, wounds, or other health-related visuals
20+
- Reviewing the condition of water, crops, or surroundings
21+
22+
With GPT’s image processing abilities, Glific makes it easy to automate tasks that involve looking at and understanding images.
23+
24+
25+
**To get the best results:**
26+
- Use clear and well-lit images
27+
- Avoid blurry or low-quality photos
28+
- Make sure the main content is visible and not blocked
29+
30+
---
31+
32+
33+
## How to Use in a Flow
34+
35+
#### Step 1: Collect Image Input from the End User
36+
- Create a `Send Message` node asking users to send an image as their response.
37+
- In the `Wait for response` node, select `Has image` as the message response type and give it a `Result Name`. In the screenshot below, `image` is used as the result name.
38+
39+
<img width="500" height="383" alt="Screenshot 2025-09-29 at 1 29 38 AM" src="https://github.com/user-attachments/assets/4d685fb3-b126-4490-bb92-23fd9f30ed0b" />
40+
41+
42+
#### Step 2: Add a Webhook Node to Process the Image
43+
- In the webhook node, select `function` as the webhook type from the dropdown.
44+
- In the function field, enter `parse_via_gpt_vision`. This function name is predefined.
45+
- Give the webhook a result name of your choice. In the screenshot example, it is named `gptvision`.
46+
47+
<img width="493" height="408" alt="Screenshot 2025-09-29 at 1 46 28 AM" src="https://github.com/user-attachments/assets/1162f959-f923-4f5d-9214-3337456e8bda" />
48+
49+
---
50+
51+
<img width="504" height="553" alt="Screenshot 2025-09-29 at 1 46 58 AM" src="https://github.com/user-attachments/assets/b398e16c-5a6b-48ac-8c16-680c30728690" />
52+
53+
54+
#### Step 3: Add Parameters in Function Body
55+
- Click on `Function Body` and pass the parameters as shown in the screenshot below.
56+
57+
<img width="493" height="343" alt="Screenshot 2025-09-29 at 1 51 22 AM" src="https://github.com/user-attachments/assets/cfafbb40-04da-4321-9cf8-d457ff08835e" />
58+
59+
60+
- `url`: pass the flow variable that holds the image sent by the user (in the given example, `image` is the result name).
61+
- `prompt`: pass the prompt, that is, the instructions telling the AI model how to process the image.
62+
- `model`: pass the GPT model to use (for example, `gpt-4o`).
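
The three parameters above can be sketched as a small JSON object. The Python snippet below is only an illustration that builds and prints such a body. The lowercase keys `url`, `prompt`, and `model` follow the earlier version of this page. The `@results.image.input` expression for reading the image URL is an assumption (check the screenshot above for the exact variable), and the helper name `build_function_body` and the prompt text are hypothetical.

```python
import json


def build_function_body(image_result_name: str, prompt: str, model: str = "gpt-4o") -> str:
    """Return JSON text that could be pasted into the webhook's Function Body."""
    body = {
        # ASSUMPTION: `.input` is how the image URL is read from the flow result;
        # confirm the exact variable against your own flow.
        "url": f"@results.{image_result_name}.input",
        # Instructions for the model (example text only).
        "prompt": prompt,
        # Model name; `gpt-4o` is the one mentioned in the earlier docs.
        "model": model,
    }
    return json.dumps(body, indent=2)


# `image` is the result name used in Step 1 of this guide.
print(build_function_body("image", "Describe what is written on this worksheet."))
```

The printed object has exactly three keys, so it can be compared field by field with the Function Body shown in the screenshot.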
63+
64+
65+
#### Step 4: Display the Text Response
66+
- Create a `Send Message` node.
67+
- Use `@results.<webhook_result_name>.response` to show the text response. In the given example, `gptvision` is the webhook result name, so the expression is `@results.gptvision.response`.
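
As a rough illustration of how the expression depends on the webhook result name, the sketch below assembles it in Python. The helper name `response_expression` and the message wording are hypothetical. Only the `@results.<name>.response` pattern comes from this guide.

```python
def response_expression(webhook_result_name: str) -> str:
    """Build the flow expression that holds the model's text response."""
    return f"@results.{webhook_result_name}.response"


# `gptvision` is the webhook result name used in Step 2 of this guide.
message = "Here is what I found in your image:\n" + response_expression("gptvision")
print(message)
```

If you name the webhook result differently in Step 2, only the middle part of the expression changes.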
68+
69+
70+
<img width="494" height="378" alt="Screenshot 2025-09-29 at 1 54 03 AM" src="https://github.com/user-attachments/assets/fa53b6af-a5f2-41cc-b058-91a6991ed611" />
71+
72+
73+
---
74+
75+
## Sample Flow
76+
Try this [Sample Flow](https://drive.google.com/file/d/1Czi9Bh7YIWGeZwpveXPu-xTGMml-h_R9/view?usp=sharing) to test the GPT Vision integration.
77+
1378

14-
15-
7. provide the following params in the function body
1679

17-
`"url":"",` in this field pass the flow variable accepting the image response from the user
18-
19-
`"prompt":""`in this field pass the prompt, or instructions you wish to convey to the AI model towards processing the image input
20-
21-
<img width="529" alt="Screenshot 2024-05-07 at 6 17 37 PM" src="https://github.com/glific/docs/assets/141305477/5d5e202c-dd9c-4569-94da-ab53d535d749"/>
2280

23-
7. To use the `gpt-4o`provide additional parameter called `model`
24-
<img width="631" alt="Screenshot 2024-05-22 at 3 33 37 PM" src="https://github.com/glific/docs/assets/141305477/ad5ea9fc-727b-43c5-ad4c-a11521542177" />
2581

26-
27-
10. Use the variable `@results.webhookname.response` to print the response from the AI model. In the above example the response will be printed by calling `@results.gptvision.response`
28-
<img width="579" alt="Screenshot 2024-05-22 at 3 34 06 PM" src="https://github.com/glific/docs/assets/141305477/b16d4708-b18e-40ff-8e53-57adc50c92c2" />
2982

3083

3184

32-
Here is an exmaple flow for you to import and try it out: https://drive.google.com/file/d/1p-L5bPnXtsZZKBA3_pJr5wsdeQm3taVD/view?usp=sharing
