Commit Graph

3 Commits (07dff3ba4e214a95bc247d567c6441edbc67938e)

Author SHA1 Message Date
Mike Gerwitz 1ad2fb1dc8 Copyright year update 2022
RSG (Ryan Specialty Group) recently announced a rename to Ryan Specialty (no
"Group"), but I'm not sure if the legal name has been changed yet or not, so
I'll wait on that.
2022-05-03 14:14:29 -04:00
Mike Gerwitz c211ada89b tamer: benches (memchr): Add missing bench attr
This benchmark was not being run.
2021-08-19 23:14:33 -04:00
Mike Gerwitz fc235b7ecc tamer: memchr benches
This adds benchmarking for the memchr crate.  It is used primarily by
quick-xml at the moment, but the question is whether to rely on it for
certain operations for XIR.

The benchmarking on an Intel Xeon system shows that memchr and Rust's
contains() perform very similarly on small inputs, matching against a single
character, and so Rust's built-in should be preferred in that case so that
we're using APIs that are familiar to most people.

When larger inputs are compared against, there's a greater benefit (a little
under ~2x).

When comparing against two characters, they are again very close.  But look
at when we compare two characters against _multiple_ inputs:

  running 24 tests
  test large_str:1️⃣:memchr_early_match                 ... bench:       4,938 ns/iter (+/- 124)
  test large_str:1️⃣:memchr_late_match                  ... bench:      81,807 ns/iter (+/- 1,153)
  test large_str:1️⃣:memchr_non_match                   ... bench:      82,074 ns/iter (+/- 1,062)
  test large_str:1️⃣:rust_contains_one_byte_early_match ... bench:       9,425 ns/iter (+/- 167)
  test large_str:1️⃣:rust_contains_one_byte_late_match  ... bench:     123,685 ns/iter (+/- 3,728)
  test large_str:1️⃣:rust_contains_one_byte_non_match   ... bench:     123,117 ns/iter (+/- 2,200)
  test large_str:1️⃣:rust_contains_one_char_early_match ... bench:       9,561 ns/iter (+/- 507)
  test large_str:1️⃣:rust_contains_one_char_late_match  ... bench:     123,929 ns/iter (+/- 2,377)
  test large_str:1️⃣:rust_contains_one_char_non_match   ... bench:     122,989 ns/iter (+/- 2,788)
  test large_str:2️⃣:memchr2_early_match                ... bench:       5,704 ns/iter (+/- 91)
  test large_str:2️⃣:memchr2_late_match                 ... bench:      89,194 ns/iter (+/- 8,546)
  test large_str:2️⃣:memchr2_non_match                  ... bench:      85,649 ns/iter (+/- 3,879)
  test large_str:2️⃣:rust_contains_two_char_early_match ... bench:      66,785 ns/iter (+/- 3,385)
  test large_str:2️⃣:rust_contains_two_char_late_match  ... bench:   2,148,064 ns/iter (+/- 21,812)
  test large_str:2️⃣:rust_contains_two_char_non_match   ... bench:   2,322,082 ns/iter (+/- 22,947)
  test small_str:1️⃣:memchr_mid_match                   ... bench:       4,737 ns/iter (+/- 842)
  test small_str:1️⃣:memchr_non_match                   ... bench:       5,160 ns/iter (+/- 62)
  test small_str:1️⃣:rust_contains_one_byte_non_match   ... bench:       3,930 ns/iter (+/- 35)
  test small_str:1️⃣:rust_contains_one_char_mid_match   ... bench:       3,677 ns/iter (+/- 618)
  test small_str:1️⃣:rust_contains_one_char_non_match   ... bench:       5,415 ns/iter (+/- 221)
  test small_str:2️⃣:memchr2_mid_match                  ... bench:       5,488 ns/iter (+/- 888)
  test small_str:2️⃣:memchr2_non_match                  ... bench:       6,788 ns/iter (+/- 134)
  test small_str:2️⃣:rust_contains_two_char_mid_match   ... bench:       6,203 ns/iter (+/- 170)
  test small_str:2️⃣:rust_contains_two_char_non_match   ... bench:       7,853 ns/iter (+/- 713)

Yikes.

With that said, we won't be comparing against such large inputs
short-term.  The larger strings (fragments) are copied verbatim, and not
compared against---but they _were_ prior to the previous commit that stopped
unencoding and re-encoding.

So: Rust built-ins for inputs that are expected to be small.
2021-08-18 14:23:03 -04:00