Add the following code to your project's shard.yml under:
dependencies to use in production
- OR -
development_dependencies to use in development
I was curious whether (as in some other languages) comparing the numeric representations of a string's characters is faster than comparing the characters themselves.
The short answer is: in Crystal, codepoints is negligibly faster.
The following is example output from a run on a ThinkPad E480.
    Benchmark #1: ../bin/hamming --chars
      Time (mean ± σ):      3.263 s ±  0.029 s    [User: 2.276 s, System: 0.975 s]
      Range (min … max):    3.247 s …  3.375 s    50 runs

    Benchmark #2: ../bin/hamming --codepoints
      Time (mean ± σ):      3.251 s ±  0.040 s    [User: 2.265 s, System: 0.974 s]
      Range (min … max):    3.228 s …  3.395 s    50 runs

    Summary
      '../bin/hamming --codepoints' ran
        1.00 ± 0.02 times faster than '../bin/hamming --chars'
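The results are fairly stable: codepoints consistently beats chars, but the
difference is tiny (about 0.012 s on a 3.26 s mean in the run above, which is
less than one standard deviation). I wouldn't look here for an optimization,
especially if you need to actually do anything with the characters (such as
displaying them).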
The code under benchmark (src/hamming.cr) calculates the Hamming distance
between two files of 100,000,000 characters each. The test (test/bench) is a
bash script that generates test data (using src/gen_input.cr) and runs
hyperfine for a performance comparison.
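
To make the two strategies concrete, here is a minimal sketch of a Hamming distance computed once over characters and once over codepoints. It is an illustration only and not the actual contents of src/hamming.cr: the method names hamming_chars and hamming_codepoints are hypothetical, and the sketch works on in-memory strings rather than on files.

```crystal
# Hypothetical helpers for illustration; not the code from src/hamming.cr.

# Counts differing positions by comparing Char values.
def hamming_chars(a, b)
  raise ArgumentError.new("inputs must have equal length") unless a.size == b.size
  a.chars.zip(b.chars).count { |x, y| x != y }
end

# Counts differing positions by comparing integer codepoints.
def hamming_codepoints(a, b)
  raise ArgumentError.new("inputs must have equal length") unless a.size == b.size
  a.codepoints.zip(b.codepoints).count { |x, y| x != y }
end

puts hamming_chars("karolin", "kathrin")      # => 3
puts hamming_codepoints("karolin", "kathrin") # => 3
```

Both methods do the same work per position, one equality check on a small value, which fits the near-identical timings reported above. Note that building full arrays with chars and codepoints is a simplification for readability; for inputs of 100,000,000 characters a streaming approach would likely be preferable.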
shards build --release