Feature/no default lora #137
Summary of Changes
Hello @Glaceon-Hyy, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the diffsynth_engine by improving the flexibility of LoRA weight loading in the Qwen-Image pipeline and adding comprehensive documentation for using the Qwen-Image model for image generation. The changes ensure broader compatibility for LoRA models and provide clear guidance for users on how to leverage the Qwen-Image capabilities.
Highlights
- LoRA Weight Loading Improvement: The Qwen-Image pipeline's LoRA weight loading mechanism has been made more robust to support LoRA models that may not include the .default suffix in their weight keys.
- Qwen-Image Documentation: New documentation has been added, in both English and Chinese, detailing how to use the QwenImagePipeline for image generation, including code examples, recommended parameters, and explanations of various control arguments.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request adds support for LoRA keys without the .default suffix in the Qwen image pipeline, making it more flexible. The documentation has also been updated to include a tutorial for the Qwen-Image model. My review includes suggestions to improve code conciseness in the pipeline and to fix some formatting and clarity issues in the documentation files.
docs/tutorial.md
Outdated
* `prompt`: The prompt, used to describe the content of the generated image, It supports multiple languages (Chinese, English, Japanese, etc.), e.g., “一只猫” (Chinese), "a cat" (English), or "庭を走る猫" (Japanese).
* `negative_prompt`:The negative prompt, used to describe content you do not want in the image, it defaults to a single space character (not an empty string), e.g., "ugly".
* `cfg_scale`:The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
* `height`: Image height.
* `width`: Image width.
* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
* `seed`: The random seed. A fixed seed ensures reproducible results.
The formatting of this parameter list is inconsistent.
- Some parameters use a full-width colon (`:`) while others use a standard colon (`:`) or none at all. Please use a standard colon for all items.
- The indentation for list items is inconsistent: the whitespace around the `*` marker varies between items. Please use `* ` consistently.
Suggested change

Original:

* `prompt`: The prompt, used to describe the content of the generated image, It supports multiple languages (Chinese, English, Japanese, etc.), e.g., “一只猫” (Chinese), "a cat" (English), or "庭を走る猫" (Japanese).
* `negative_prompt`:The negative prompt, used to describe content you do not want in the image, it defaults to a single space character (not an empty string), e.g., "ugly".
* `cfg_scale`:The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
* `height`: Image height.
* `width`: Image width.
* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
* `seed`: The random seed. A fixed seed ensures reproducible results.

Suggested:

* `prompt`: The prompt, used to describe the content of the generated image, It supports multiple languages (Chinese, English, Japanese, etc.), e.g., “一只猫” (Chinese), "a cat" (English), or "庭を走る猫" (Japanese).
* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, it defaults to a single space character (not an empty string), e.g., "ugly".
* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
* `height`: Image height.
* `width`: Image width.
* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
* `seed`: The random seed. A fixed seed ensures reproducible results.
docs/tutorial_zh.md
Outdated
* `prompt`: 提示词,用于描述生成图像的内容,支持多种语言(中文/英文/日文等),例如“一只猫”/"a cat"/"庭を走る猫"。
* `negative_prompt`:负面提示词,用于描述不希望图像中出现的内容,例如“ugly”,默认为一个空格而不是空字符串, " "。
* `cfg_scale`:[Classifier-free guidance](https://arxiv.org/abs/2207.12598) 的引导系数,通常更大的引导系数可以达到更强的文图相关性,但会降低生成内容的多样性,推荐值为4。
* `height`:图像高度。
* `width`:图像宽度。
* `num_inference_steps`:推理步数,通常推理步数越多,计算时间越长,图像质量越高。
* `seed`:随机种子,固定的随机种子可以使生成的内容固定。
The formatting of this parameter list is inconsistent.
- The parameter `prompt` uses a standard colon (`:`) while others use a full-width colon (`:`). For consistency in a Chinese document, please use the full-width colon for all items.
- The spacing after the asterisk is inconsistent. Please use `* ` for all items.
Suggested change

Original:

* `prompt`: 提示词,用于描述生成图像的内容,支持多种语言(中文/英文/日文等),例如“一只猫”/"a cat"/"庭を走る猫"。
* `negative_prompt`:负面提示词,用于描述不希望图像中出现的内容,例如“ugly”,默认为一个空格而不是空字符串, " "。
* `cfg_scale`:[Classifier-free guidance](https://arxiv.org/abs/2207.12598) 的引导系数,通常更大的引导系数可以达到更强的文图相关性,但会降低生成内容的多样性,推荐值为4。
* `height`:图像高度。
* `width`:图像宽度。
* `num_inference_steps`:推理步数,通常推理步数越多,计算时间越长,图像质量越高。
* `seed`:随机种子,固定的随机种子可以使生成的内容固定。

Suggested:

* `prompt`:提示词,用于描述生成图像的内容,支持多种语言(中文/英文/日文等),例如“一只猫”/"a cat"/"庭を走る猫"。
* `negative_prompt`:负面提示词,用于描述不希望图像中出现的内容,例如“ugly”,默认为一个空格而不是空字符串, " "。
* `cfg_scale`:[Classifier-free guidance](https://arxiv.org/abs/2207.12598) 的引导系数,通常更大的引导系数可以达到更强的文图相关性,但会降低生成内容的多样性,推荐值为4。
* `height`:图像高度。
* `width`:图像宽度。
* `num_inference_steps`:推理步数,通常推理步数越多,计算时间越长,图像质量越高。
* `seed`:随机种子,固定的随机种子可以使生成的内容固定。
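The two formatting comments on the parameter lists (standard colons in docs/tutorial.md, full-width colons in docs/tutorial_zh.md, one space after `*` in both) could also be applied mechanically. The following is a rough sketch; the function name and the `full_width` flag are hypothetical, and it only handles bullets of the form ``* `name`: description``.

```python
import re

# Matches a Markdown bullet "* `name`<optional colon> description".
# Both the ASCII colon ":" and the full-width colon ":" are accepted on input.
ITEM = re.compile(r"^\s*\*\s*(`[^`]+`)\s*[::]?\s*(.*)$")


def normalize_param_list(lines, full_width=False):
    """Rewrite parameter bullets with one space after '*' and a consistent colon.

    full_width=False emits "* `name`: text" (English document);
    full_width=True emits "* `name`:text" (Chinese document).
    Lines that are not parameter bullets are returned unchanged.
    """
    separator = ":" if full_width else ": "
    result = []
    for line in lines:
        match = ITEM.match(line)
        if match:
            result.append(f"* {match.group(1)}{separator}{match.group(2)}")
        else:
            result.append(line)
    return result
```

For example, `normalize_param_list(lines)` would be run over the English list and `normalize_param_list(lines, full_width=True)` over the Chinese one.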
docs/tutorial.md
Outdated
### Image Generation(Qwen-Image)

The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. Recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, with a suggested cfg_scale of 4. If no negative_prompt is provided, it defaults to a single space character (not an empty string). For multi-GPU parallelism, currently only cfg parallelism is supported (parallelism=2), with other optimization efforts underway.

The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. Recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, cfg_scale, with a suggested cfg_scale of 4. If no negative_prompt is provided, it defaults to a single space character (not an empty string). For multi-GPU parallelism, currently only cfg parallelism is supported (parallelism=2), with other optimization efforts underway.
The `cfg_scale,` here is a stray typo and can be removed.
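As a small illustration of the recommendations in the quoted tutorial paragraph, the sketch below picks the recommended resolution closest to a requested aspect ratio. The constant and function names are hypothetical and are not part of `QwenImagePipeline`. Reading each pair as width×height is an assumption, since the paragraph does not label the axes.

```python
import math

# Values quoted from the tutorial paragraph; treated here as (width, height) pairs (assumption).
RECOMMENDED_RESOLUTIONS = [(928, 1664), (1104, 1472), (1328, 1328), (1472, 1104), (1664, 928)]
SUGGESTED_CFG_SCALE = 4         # suggested value stated in the tutorial
DEFAULT_NEGATIVE_PROMPT = " "   # a single space, not an empty string, per the tutorial


def closest_resolution(width: int, height: int) -> tuple:
    """Return the recommended (width, height) whose aspect ratio is nearest the requested one.

    Ratios are compared in log space so that 2:1 and 1:2 are equally far from 1:1.
    """
    target = math.log(width / height)
    return min(
        RECOMMENDED_RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )
```

A caller could then pass the chosen size, together with the two default values, to whatever generation call the tutorial documents.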
No description provided.