Using `VectorMath::DotProduct()` in GatedRecurrentLayer to reuse existing
SIMD optimizations. Results:
- When SSE2/AVX2 is avilable, the GRU layer takes 40% of the unoptimized
code
- The realtime factor for the VAD improved as follows
- SSE2: from 570x to 630x
- AVX2: from 610x to 680x
This CL also improved the GRU layer benchmark by (i) benchmarking a GRU
layer havibng the same size of that used in the VAD and (ii) by prefetching
a long input sequence.
Bug: webrtc:10480
Change-Id: I9716b15661e4c6b81592b4cf7c172d90e41b5223
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/195545
Reviewed-by: Per Åhgren <peah@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#32803}
In preparation for adding AVX2 code, a safe scheme to support
different SIMD optimizations is added.
Safety features:
- AVX2 kill switch to stop using it even if supported by the
architecture
- struct indicating the available CPU features propagated from
AGC2 to each component; in this way
- better control over the unit tests
- no need to propagate individual kill switches but just
set to false features that are turned off
Note that (i) this CL does not change the performance of the RNN VAD
and (ii) no AVX2 optimization is added yet.
Bug: webrtc:10480
Change-Id: I0e61f3311ecd140f38369cf68b6e5954f3dc1f5a
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/193140
Reviewed-by: Per Åhgren <peah@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#32739}
This reverts commit ea89f2a447.
Reason for revert: bug in ancestor CL https://webrtc-review.googlesource.com/c/src/+/191320
Original change's description:
> RNN VAD: pitch search optimizations (part 3)
>
> `ComputeSlidingFrameSquareEnergies()` which computes the energy of a
> sliding 20 ms frame in the pitch buffer has been switched from backward
> to forward.
>
> The benchmark has shown a slight improvement (about +6x).
>
> This change is not bit exact but all the tolerance tests still pass
> except for one single case in `RnnVadTest,PitchSearchWithinTolerance`
> for which the tolerance has been slightly increased. Note that the pitch
> estimation is still bit-exact.
>
> Benchmarked as follows:
> ```
> out/release/modules_unittests \
> --gtest_filter=*RnnVadTest.DISABLED_RnnVadPerformance* \
> --gtest_also_run_disabled_tests --logs
> ```
>
> Results:
>
> | baseline | this CL
> ------+----------------------+------------------------
> run 1 | 22.8319 +/- 1.46554 | 22.087 +/- 0.552932
> | 389.367x | 402.499x
> ------+----------------------+------------------------
> run 2 | 22.4286 +/- 0.726449 | 22.216 +/- 0.916222
> | 396.369x | 400.162x
> ------+----------------------+------------------------
> run 2 | 22.5688 +/- 0.831341 | 22.4902 +/- 1.04881
> | 393.906x | 395.283x
>
> Bug: webrtc:10480
> Change-Id: I1fd54077a32e25e46196c8e18f003cd0ffd503e1
> Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/191703
> Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
> Reviewed-by: Karl Wiberg <kwiberg@webrtc.org>
> Cr-Commit-Position: refs/heads/master@{#32572}
TBR=alessiob@webrtc.org,kwiberg@webrtc.org
Change-Id: I57a8f937ade0a35e1ccf0e229c391cc3a10e7c48
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: webrtc:10480
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/192621
Reviewed-by: Alessio Bazzica <alessiob@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#32578}
`ComputeSlidingFrameSquareEnergies()` which computes the energy of a
sliding 20 ms frame in the pitch buffer has been switched from backward
to forward.
The benchmark has shown a slight improvement (about +6x).
This change is not bit exact but all the tolerance tests still pass
except for one single case in `RnnVadTest,PitchSearchWithinTolerance`
for which the tolerance has been slightly increased. Note that the pitch
estimation is still bit-exact.
Benchmarked as follows:
```
out/release/modules_unittests \
--gtest_filter=*RnnVadTest.DISABLED_RnnVadPerformance* \
--gtest_also_run_disabled_tests --logs
```
Results:
| baseline | this CL
------+----------------------+------------------------
run 1 | 22.8319 +/- 1.46554 | 22.087 +/- 0.552932
| 389.367x | 402.499x
------+----------------------+------------------------
run 2 | 22.4286 +/- 0.726449 | 22.216 +/- 0.916222
| 396.369x | 400.162x
------+----------------------+------------------------
run 2 | 22.5688 +/- 0.831341 | 22.4902 +/- 1.04881
| 393.906x | 395.283x
Bug: webrtc:10480
Change-Id: I1fd54077a32e25e46196c8e18f003cd0ffd503e1
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/191703
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Karl Wiberg <kwiberg@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#32572}
This CL adds the SSE2 optimized implementation for fully connected
(FC) layers. The change includes a weights re-alignment op done once
at construction time. It is required in order to optimize the load op
to fill 128 bit registers.
This CL also includes unit test adaptations and a benchmark test
(disabled by default).
Bug: webrtc:10480
Change-Id: I5ed87f0a629faaaf4c8bffbce1cea5557518f8c8
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/141862
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Gustaf Ullberg <gustaf@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#29712}
- add test that checks that the computed VAD probability is within
tolerance *1
- speed-up some tests by reducing the input length and skipping frames
- remove unused code in test_utils
- fix some comments
*1: RnnVadTest::RnnBitExactness is replaced by
RnnVadTest::RnnVadProbabilityWithinTolerance
Bug: webrtc:10480
Change-Id: I19332d06eacffbbe671bf7749ff4c92798bdc55c
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/133910
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#27803}
This CL refactors the computation of band energy and spectral
cross-correlation coefficients by moving and optimizing
the code from ComputeBandCoefficients, ComputeBandEnergies and
ComputeSpectralCrossCorrelation into a single class (named
BandFeaturesExtractor).
This change will also help replacing FFT library in the RNN VAD.
Bug: webrtc:10480
Change-Id: I6cefa23e8f3bc8de6eb09d3ea434699d5e19124e
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/129726
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Per Åhgren <peah@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#27535}
This CL adds helper functions to be used for the spectral features
computation. Namely, it includes the following:
- band boundaries (frequency to FFT coeffcient index)
- band energy coefficients
- log band energy coefficients
- fixed size DCT table and computation
Bug: webrtc:9076
Change-Id: I03a8799b226d986bc1e37cefd0c3039f94b5592a
Reviewed-on: https://webrtc-review.googlesource.com/73687
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Minyue Li <minyue@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23170}
RNN implementation for the AGC2 VAD that includes a fully connected
layer and a gated recurrent unit layer.
Bug: webrtc:9076
Change-Id: Ibb8b0b4e9213f09eb9dbe118bbdc94d7e8e4f91b
Reviewed-on: https://webrtc-review.googlesource.com/72060
Reviewed-by: Patrik Höglund <phoglund@webrtc.org>
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Ivo Creusen <ivoc@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23101}
Functions to estimate pitch period and gain.
Bug: webrtc:9076
Change-Id: Icfe9430dcae11bdb96165c5bfe6e2b1d3bf848ab
Reviewed-on: https://webrtc-review.googlesource.com/70382
Commit-Queue: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Per Åhgren <peah@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23066}
Functions to estimate the inverse filter via LPC and compute the LP
residual applying the inverse filter.
This CL also includes test utilities, in particular BinaryFileReader,
used to read chunks of data and optionally cast them on the fly, and
Create*Reader() functions to read resource files available at test
time.
Bug: webrtc:9076
Change-Id: Ia4793b8ad6a63cb3089ed11ddad89d1aa0b840f6
Reviewed-on: https://webrtc-review.googlesource.com/70244
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Jesus de Vicente Pena <devicentepena@webrtc.org>
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#22946}