Skip to content

Comments

Extract inline images into attachments for plain rich text messages#3256

Merged
tomholub merged 17 commits intomasterfrom
issue-2460-images-inside-plain-rich-text
Dec 31, 2020
Merged

Extract inline images into attachments for plain rich text messages#3256
tomholub merged 17 commits intomasterfrom
issue-2460-images-inside-plain-rich-text

Conversation

@limonte
Copy link

@limonte limonte commented Dec 16, 2020

@limonte

This comment has been minimized.

@tomholub

This comment has been minimized.

@tomholub tomholub closed this Dec 16, 2020
@tomholub tomholub reopened this Dec 16, 2020
@tomholub

This comment has been minimized.

@limonte

This comment has been minimized.

@limonte

This comment has been minimized.

@limonte
Copy link
Author

limonte commented Dec 17, 2020

@tomholub it works now, thanks! Sometimes a good question is better than the advice :)

I'll continue working on figuring out why some tests are failing as well as writing a test for this fix.

Question: in extractInlineImagesToAttachments() I'm iterating and editing images with DOMPurify. Is that the best way to perform string -> nodes -> edit nodes -> back to string?

@limonte
Copy link
Author

limonte commented Dec 17, 2020

Another question: I noticed there was Att -> Attachment renaming recently.

There are still some /Att[A-Z']/ occurrences and I'm using Mime.createAttNode()

image

I guess we should also do /Att[A-Z']/ -> Attachment$1, if so I'd prefer it to be done in separate PR not to mix the fix and the refactoring in one PR.

@tomholub
Copy link
Collaborator

tomholub commented Dec 17, 2020 via email

Copy link
Collaborator

@tomholub tomholub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good :)

return node;
}

private static extractInlineImagesToAttachments = (html: string, rootNode: any) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in high-level overview 👍 but will need to give this more scrutiny later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seisvelas should scrutinize it for attack vectors if any

@limonte limonte force-pushed the issue-2460-images-inside-plain-rich-text branch from 41a3be6 to f896c36 Compare December 19, 2020 01:08
@limonte limonte marked this pull request as ready for review December 19, 2020 02:03
@limonte limonte changed the title [WIP] Extract inline images into attachments for plain rich text messages Extract inline images into attachments for plain rich text messages Dec 19, 2020
@limonte
Copy link
Author

limonte commented Dec 19, 2020

@tomholub do you maybe have any quick guess why the test is failing, there's no <img> tag in getMessageBySubject().payload.body.data. I checked the sending, the HTML is correct (with <img> tag) right before sending.

@tomholub
Copy link
Collaborator

@tomholub do you maybe have any quick guess why the test is failing, there's no <img> tag in getMessageBySubject().payload.body.data. I checked the sending, the HTML is correct (with <img> tag) right before sending.

Sorry haven't been able to have a look yet, can't think of anything without investigating

Except if the img got sanitized away somewhere using Xss class

@limonte
Copy link
Author

limonte commented Dec 19, 2020

@tomholub do you maybe have any quick guess why the test is failing, there's no <img> tag in getMessageBySubject().payload.body.data. I checked the sending, the HTML is correct (with <img> tag) right before sending.

Sorry haven't been able to have a look yet, can't think of anything without investigating

Except if the img got sanitized away somewhere using Xss class

Thanks, I'll investigate further.

@limonte
Copy link
Author

limonte commented Dec 20, 2020

@tomholub I need your help with writing a test. There's <img src="data:image/png;base64..."> instead of <img src="cid:...">. I was tracking it down to this place, mimeMsg is as expected here, with extracted inline images.

I guess something is happening mocked google endpoints, but I can't figure out what and where. Let me know if you have time for helping me with this, if not, then I'll continue debugging it by myself.

@limonte limonte changed the title Extract inline images into attachments for plain rich text messages [WIP] Extract inline images into attachments for plain rich text messages Dec 23, 2020
@limonte limonte marked this pull request as draft December 23, 2020 17:15
@tomholub
Copy link
Collaborator

Sorry, I only do light work this week so may take a bit of time before I get to this with meaningful help.

@rrrooommmaaa - if you have the time - could you look into this?

@limonte
Copy link
Author

limonte commented Dec 23, 2020

Sorry, I only do light work this week so may take a bit of time before I get to this with meaningful help.

@rrrooommmaaa - if you have the time - could you look into this?

Thanks, I realized that from your GitHub activity :) @rrrooommmaaa no need for help atm, I think I'll finish this by myself. I'll let you know additionally if the help will be needed here.

@limonte
Copy link
Author

limonte commented Dec 23, 2020

@tomholub I need your help with writing a test. There's <img src="data:image/png;base64..."> instead of <img src="cid:...">. I was tracking it down to this place, mimeMsg is as expected here, with extracted inline images.

Found the reason:

simpleParser is the easiest way to parse emails. You only need to provide a message source to get a parsed email structure in return. As an additional bonus all embedded images in HTML (eg. the images that point to attachments using cid: URIs) are replaced with base64 encoded data URIs, so the message can be displayed without any additional processing

https://nodemailer.com/extras/mailparser/

@limonte limonte marked this pull request as ready for review December 23, 2020 23:51
@limonte limonte changed the title [WIP] Extract inline images into attachments for plain rich text messages Extract inline images into attachments for plain rich text messages Dec 23, 2020
@limonte
Copy link
Author

limonte commented Dec 23, 2020

@tomholub the PR is ready for review 🚀

@seisvelas please evaluate the security of the extractInlineImagesToAttachments() method

Thanks!

Copy link
Collaborator

@tomholub tomholub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! It seems that FlowCrypt itself cannot render such messages yet? We may want to look into that sometime, when using Settings -> Inbox.

expect(sentMsg.payload?.body?.data).to.match(/<img src="cid:(.+)@flowcrypt">This is an automated puppeteer test: Test Sending Plain Message With Image/);
return;
// todo - this test case is a stop-gap. We need to implement rendering of such messages below,
// then let test plain messages with images in them (referenced by cid) just like other types of messages below
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make an issue for this

}
if (choices.richtext && !choices.encrypt && !choices.sign && msg.body['text/html']) {
// extract inline images of plain rich-text messages (#3256)
// todo - also apply to rich text signed-only messages
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also todo here for later

@tomholub tomholub merged commit 727b774 into master Dec 31, 2020
@tomholub tomholub deleted the issue-2460-images-inside-plain-rich-text branch December 31, 2020 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

format images of signed or plain rich text messages

2 participants