Skip to content

Add ReaderFrom/WriterTo for Linux (splice)#29

Open
cpuguy83 wants to merge 1 commit intocontainerd:mainfrom
cpuguy83:add_readerfrom_writerto
Open

Add ReaderFrom/WriterTo for Linux (splice)#29
cpuguy83 wants to merge 1 commit intocontainerd:mainfrom
cpuguy83:add_readerfrom_writerto

Conversation

@cpuguy83
Copy link
Copy Markdown
Member

On Linux, we can use splice to optimize copies to happen in kernel
when copying to other files.
Since this is already a pipe, this should always work when copying
to/from another file.

@cpuguy83 cpuguy83 changed the title Add ReaderFrom/WriterTo for Linux Add ReaderFrom/WriterTo for Linux (splice) Nov 21, 2020
@cpuguy83 cpuguy83 force-pushed the add_readerfrom_writerto branch 5 times, most recently from 0a81acb to abf9d52 Compare November 21, 2020 06:32
// splice not supported on kernel
atomic.StoreInt32(&spliceSupported, 0)
return true
case syscall.EINVAL, syscall.EOPNOTSUPP, syscall.EPERM:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would syscall.EOPNOTSUPP here indicate the process is unable to perform the syscall regardless of whether the kernel supports it? If this doesn't depend on input would it make sense to treat it like ENOSYS?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think EOPNOTSUP is more about the transport than the actual system.
For instance, I'd expect EOPNOTSUP if the read side is a unix socket which does not support splicing... but I have not tested this... it may actually be worth adding tests for this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the specific case may actually be EINVAL...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, its not super clear when these standard return values would be returned in this case and better not to set a global value

@cpuguy83 cpuguy83 force-pushed the add_readerfrom_writerto branch 2 times, most recently from 6e8e675 to 2e799de Compare July 30, 2021 00:39
On Linux, we can use `splice` to optimize copies to happen in kernel
when copying to other files.
Since this is already a pipe, this should always work when copying
to/from another file.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
@cpuguy83 cpuguy83 force-pushed the add_readerfrom_writerto branch from 2e799de to 19b9833 Compare July 30, 2021 17:21
@cpuguy83
Copy link
Copy Markdown
Member Author

Updated this, added some new test cases and improved the implementation slightly:

  1. EINTR handling was breaking out of the loop instead of just retrying - fixed
  2. Before the max amount that could be copied before the new methods would return was 1<<62... which is a lot... but is not expected since we want to copy until EOF (splice copies 0 bytes), so the internal copy takes a special value, -1, to mean copy to EOF. This is important because we have special handling for when the reader passed in to ReadFrom is an *io.LimitedReader and this was originally mixing "amount to copy" with "amount remaining to copy".

@cpuguy83
Copy link
Copy Markdown
Member Author

I'm going to do some more testing on this as well.

@cpuguy83
Copy link
Copy Markdown
Member Author

FYI, ended up making a new repo to play around with this a bit more, benchmarks, etc: https://github.com/cpuguy83/pipes
Somewhat suprisingly I'm seeing an approx 2x speedup using splice.

I've made some changes to the implementation that make it a bit cleaner which I'll move here as well.

@thaJeztah
Copy link
Copy Markdown
Member

Somewhat suprisingly I'm seeing an approx 2x speedup using splice.

Silly question; the benchmark in the readme doesn't show a delta; is that because some option wasn't set?

(Performance improvement sounds great though!)

@cpuguy83
Copy link
Copy Markdown
Member Author

Hmm I'm not sure.
I hadn't used benchstat before...
benchcmp shows the delta.

@thaJeztah
Copy link
Copy Markdown
Member

@cpuguy83 @kzys @dmcgowan what's the status on this one? Do we want to have this merged and included in a release?

I went looking what changes are in main that are not yet released (following #32 (comment)), so was looking if we wanted to include this PR as well if we would be doing a new release.

@kzys
Copy link
Copy Markdown
Member

kzys commented Feb 19, 2022

Sorry. I have missed the ping. Let me take a look next week.

Copy link
Copy Markdown
Member

@samuelkarp samuelkarp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions and nits, but otherwise LGTM.

Comment on lines +47 to +54
select {
case <-f.opened:
return f.readFrom(r)
default:
}
select {
case <-f.opened:
return f.readFrom(r)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of trying to read from <-f.opened twice here?

Comment on lines +75 to +77
if !ok {
return copyBuffer(f.file, r)
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this error case also need to handle lr != nil?

Suggested change
if !ok {
return copyBuffer(f.file, r)
}
if !ok {
if lr != nil {
r = lr
}
return copyBuffer(f.file, r)
}

remain = spliceMax
}

// Hear the RawConn Read/Write methods allow us to utilize the go runtime
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
// Hear the RawConn Read/Write methods allow us to utilize the go runtime
// Here the RawConn Read/Write methods allow us to utilize the go runtime

Comment on lines +199 to +206
case nil:
handled = true
if n == 0 {
// At EOF
return true
}
case unix.EINTR:
continue
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think it'd be slightly more readable to have an explicit continue in here.

Suggested change
case nil:
handled = true
if n == 0 {
// At EOF
return true
}
case unix.EINTR:
continue
case nil:
handled = true
if n == 0 {
// At EOF
return true
}
continue
case unix.EINTR:
continue

Also, what do you think about reordering the switch to order successful cases before errors and have all the errors grouped together? Something like nil, unix.EINTR, unix.EAGAIN, unix.ENOSYS, syscall.EINVAL, syscall.EOPNOTSUPP, syscall.EPERM, default?

Comment on lines +113 to +121
select {
case <-f.opened:
return f.writeTo(w)
default:
}

select {
case <-f.opened:
return f.writeTo(w)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here about the repeated read from f.opened.

data := strings.Repeat("This is a test, this is only a test.", 1000)

// For these test cases we only call ReadFrom and validate there is no error and the
// amouont of data it copied is what we put into it.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
// amouont of data it copied is what we put into it.
// amount of data it copied is what we put into it.

@dcantah
Copy link
Copy Markdown
Member

dcantah commented May 6, 2025

@cpuguy83 Have any bandwidth to try and push this forward? If not I'm happy to help, this would be sweet to have

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Needs Update

Development

Successfully merging this pull request may close these issues.

6 participants