Add read_wav and list_wav_info support#406

Merged
yongtang merged 2 commits into tensorflow:master from yongtang:audio
Aug 4, 2019
yongtang commented Aug 2, 2019

This PR adds read_wav and list_wav_info support so that a WAV file can be read into a tensor.
Since WAV files are splittable, read_wav can take start and count parameters so that only a slice of the file is read.

Reading a chunk of a WAV file is related to #49 (comment).

This is also part of the effort to rework Dataset on top of primitive ops. See #382 and #366 for related discussions.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

yongtang commented Aug 2, 2019

/cc @faroit for reading a chunk of a WAV file. You can use read_wav and pass start=start, count=count to read only a chunk of the samples.

Note that start is the index of the first sample to read.

The total number of samples in the WAV file can be obtained with:
spec, rate = list_wav_info(filename), where spec is a TensorSpec that gives the shape and dtype of the tensor. The shape is [n, c], where n is the total number of samples and c is the number of channels.

The dtype is int8 for 8-bit audio and int16 for 16-bit audio.
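
To make the slicing idea concrete, here is a minimal sketch that uses only Python's standard `wave` module, not the ops added in this PR. The helper names `wav_info` and `read_wav_slice` are hypothetical, and the sketch assumes 16-bit PCM input. It reports the [n, c] shape first and then reads only the samples in the half-open range [start, stop).

```python
import io
import sys
import wave
from array import array


def wav_info(source):
    """Return ((n, c), sample_width_in_bytes, rate) for a WAV file.

    `source` may be a path or a binary file-like object.
    Hypothetical helper; not the list_wav_info op from this PR.
    """
    with wave.open(source, "rb") as w:
        shape = (w.getnframes(), w.getnchannels())
        return shape, w.getsampwidth(), w.getframerate()


def read_wav_slice(source, start, stop):
    """Read samples [start, stop) as a list of per-channel rows.

    Hypothetical helper; assumes 16-bit PCM data.
    """
    with wave.open(source, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = w.getnchannels()
        # Clamp the requested range to what the file actually holds.
        stop = min(stop, w.getnframes())
        start = max(0, min(start, stop))
        w.setpos(start)  # seek by sample index, not by byte offset
        raw = w.readframes(stop - start)
    samples = array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV PCM data is little-endian
    # Group the interleaved values into rows of shape [count, c].
    return [list(samples[i:i + channels])
            for i in range(0, len(samples), channels)]


def make_demo_wav():
    """Build a 10-sample, 2-channel, 16-bit WAV in memory: row i is (i, -i)."""
    values = array("h", [v for i in range(10) for v in (i, -i)])
    if sys.byteorder == "big":
        values.byteswap()
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(values.tobytes())
    return buf.getvalue()
```

Calling `read_wav_slice(io.BytesIO(make_demo_wav()), 3, 5)` reads two samples without decoding the rest of the file, which is the property that makes chunked reading cheap for uncompressed WAV data.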

I tried to see if I could also add 24-bit support, but I need a sample 24-bit WAV file to check the actual memory layout (to figure out how to fit 24-bit samples into an int32).
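
For reference, a sketch of one way 24-bit samples could be widened to int32, assuming the usual PCM WAV layout of 3 bytes per sample, little-endian, signed. This is not what the PR implements, and the helper name `pcm24_to_int32` and its `left_justify` flag are hypothetical. There are two plausible conventions: sign-extend and keep the 24-bit value range, or shift left by 8 bits so the values span the full int32 range.

```python
def pcm24_to_int32(raw, left_justify=False):
    """Convert packed 24-bit little-endian signed PCM bytes to int32 values.

    Hypothetical helper. With left_justify=False the values stay in
    [-2**23, 2**23 - 1]; with left_justify=True they are shifted left
    by 8 bits to span the int32 range.
    """
    if len(raw) % 3 != 0:
        raise ValueError("24-bit PCM data must be a multiple of 3 bytes")
    out = []
    for i in range(0, len(raw), 3):
        # int.from_bytes performs the sign extension for us.
        value = int.from_bytes(raw[i:i + 3], "little", signed=True)
        out.append(value << 8 if left_justify else value)
    return out
```

Which convention is preferable depends on whether downstream code expects the original 24-bit magnitudes or a full-scale int32 signal.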

faroit commented Aug 2, 2019

@yongtang This is great!

count sounds a bit wobbly; I would propose using either

  • frames or num_frames (samples*channel)
  • length (duration in samples)

or specifying the end position in samples with

  • stop
  • end

faroit commented Aug 2, 2019

@yongtang before I try it: is the wavdataset from this PR compatible with the TF 2.x beta?

faroit commented Aug 4, 2019

Looks great, thanks

yongtang merged commit 8af0a01 into tensorflow:master Aug 4, 2019
yongtang deleted the audio branch August 4, 2019 14:53
i-ony pushed a commit to i-ony/io that referenced this pull request Feb 8, 2021
* Add read_wav and list_wav_info support

This PR adds read_wav and list_wav_info support
so that a WAV file can be read into a tensor.
Since WAV files are splittable, read_wav can take
start and count parameters so that only a slice
of the file is read.

This is also part of the effort to rework Dataset
on top of primitive ops.

See #382 and #366 for related discussions.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Rename start, count to start, stop

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>