webrtc/modules/audio_processing/test/conversational_speech

Conversational Speech generator tool

Tool to generate multiple-end audio tracks to simulate conversational speech with two or more participants.

The input to the tool is a directory containing a number of audio tracks and a text file indicating how to time the sequence of speech turns (see the Example section).

Since the timing of the speaking turns is specified by the user, the generated tracks may not be suitable for testing scenarios in which there is unpredictable network delay (e.g., end-to-end RTC assessment).

Instead, the generated tracks can be used when the delay is constant (including, trivially, the case in which there is no delay). For instance, echo cancellation in the APM module can be evaluated using a two-end pair of tracks as input and reverse input.

By indicating time offsets, one can shape the conversation: a negative offset produces cross-talk (a.k.a. double-talk), whereas a positive offset inserts silence between speech turns.

Example

For each end, there is a set of audio tracks, e.g., a1, a2, a3 and a4 (speaker A) and b1, b2 (speaker B). The text file with the timing information may look like this:

A a1 0
B b1 0
A a2 100
B b2 -200
A a3 0
A a4 0

The first column indicates the speaker name, the second contains the audio track file names, and the third the offsets (in milliseconds) used to concatenate the chunks.
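As an illustrative sketch (not the tool's actual C++ parser), reading this three-column format amounts to splitting each non-empty line into a speaker name, a track file name, and an integer offset:

```python
# Parse a conversational-speech timing file into (speaker, track, offset_ms)
# tuples. This helper is illustrative; the real parser is implemented in the
# C++ sources of this directory.

def parse_timing(text):
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        speaker, track, offset = line.split()
        turns.append((speaker, track, int(offset)))
    return turns

timing = """\
A a1 0
B b1 0
A a2 100
B b2 -200
A a3 0
A a4 0
"""
turns = parse_timing(timing)
```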

Assume that all the audio tracks in the example above are 1000 ms long. The tool will then generate two tracks (A and B) that look like this:

Track A

  a1 (1000 ms)
  silence (1100 ms)
  a2 (1000 ms)
  silence (800 ms)
  a3 (1000 ms)
  a4 (1000 ms)

Track B

  silence (1000 ms)
  b1 (1000 ms)
  silence (900 ms)
  b2 (1000 ms)
  silence (2000 ms)
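The schedule above follows from a simple rule: each turn starts where the previous turn in the timing file ended, shifted by its offset (negative offsets make turns overlap). A minimal sketch of that computation, assuming every track in the example is 1000 ms long:

```python
# Compute turn start/end times from the example timing file: each turn starts
# at the end of the previous turn, shifted by its offset. Illustrative only;
# the actual scheduling is done by the C++ tool.

turns = [("A", "a1", 0), ("B", "b1", 0), ("A", "a2", 100),
         ("B", "b2", -200), ("A", "a3", 0), ("A", "a4", 0)]
DURATION_MS = 1000  # all example tracks are assumed 1000 ms long

schedule = []  # (speaker, track, start_ms, end_ms)
cursor = 0
for speaker, track, offset in turns:
    start = cursor + offset
    schedule.append((speaker, track, start, start + DURATION_MS))
    cursor = start + DURATION_MS

# a2 starts at 2100 ms (100 ms of silence after b1 ends at 2000 ms), and
# b2 starts at 2900 ms, overlapping the last 200 ms of a2 (cross-talk).
```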

The two tracks can also be visualized as follows (one character represents 100 ms, "." is silence and "*" is speech).

t: 0         1        2        3        4        5        6 (s)
A: **********...........**********........********************
B: ..........**********.........**********....................
                                ^ 200 ms cross-talk
        100 ms silence ^
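The ASCII view above can be reproduced from the same per-turn schedule (one character per 100 ms). This rendering helper is illustrative and not part of the tool:

```python
# Render per-speaker activity strings at 100 ms resolution ('.' = silence,
# '*' = speech) from a (speaker, track, start_ms, end_ms) schedule.

schedule = [("A", "a1", 0, 1000), ("B", "b1", 1000, 2000),
            ("A", "a2", 2100, 3100), ("B", "b2", 2900, 3900),
            ("A", "a3", 3900, 4900), ("A", "a4", 4900, 5900)]
STEP_MS = 100

def render(schedule, speaker):
    total = max(end for _, _, _, end in schedule)
    slots = ["."] * (total // STEP_MS)
    for spk, _, start, end in schedule:
        if spk == speaker:
            for i in range(start // STEP_MS, end // STEP_MS):
                slots[i] = "*"
    return "".join(slots)
```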