# strsim-rs

[![Crates.io](https://img.shields.io/crates/v/strsim.svg)](https://crates.io/crates/strsim)
[![Crates.io](https://img.shields.io/crates/l/strsim.svg?maxAge=2592000)](https://github.com/dguo/strsim-rs/blob/master/LICENSE)
[![CI status](https://github.com/dguo/strsim-rs/workflows/CI/badge.svg)](https://github.com/dguo/strsim-rs/actions?query=branch%3Amaster)
[![unsafe forbidden](https://img.shields.io/badge/unsafe-forbidden-success.svg)](https://github.com/rust-secure-code/safety-dance/)

[Rust](https://www.rust-lang.org) implementations of [string similarity metrics]:
  - [Hamming]
  - [Levenshtein] - distance & normalized
  - [Optimal string alignment]
  - [Damerau-Levenshtein] - distance & normalized
  - [Jaro and Jaro-Winkler] - this implementation of Jaro-Winkler does not limit the common prefix length
  - [Sørensen-Dice]
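
To illustrate what an unlimited common prefix means for Jaro-Winkler, here is a minimal sketch that applies the Winkler prefix boost to an already-computed Jaro score. It is not the crate's code: `winkler_boost_sketch` and its `max_prefix` parameter are hypothetical names introduced for this example, the `0.1` scaling factor is assumed from the conventional definition, and `Some(4)` stands in for the cap used in the common textbook formulation.

```rust
/// Applies the Winkler prefix boost to an already-computed Jaro similarity.
/// `max_prefix` is a hypothetical knob for this sketch: `None` leaves the
/// shared prefix uncapped, `Some(4)` mimics the common textbook cap.
fn winkler_boost_sketch(jaro: f64, a: &str, b: &str, max_prefix: Option<usize>) -> f64 {
    // Count the leading characters that the two strings share.
    let shared = a.chars().zip(b.chars()).take_while(|(x, y)| x == y).count();
    let prefix = match max_prefix {
        Some(cap) => shared.min(cap),
        None => shared,
    };
    // 0.1 is the conventional scaling factor (an assumption here). Clamp the
    // result, because an uncapped prefix longer than 10 characters would
    // otherwise push the score past 1.0.
    (jaro + 0.1 * prefix as f64 * (1.0 - jaro)).min(1.0)
}
```

With an illustrative Jaro score of `0.8` for `"cheeseburger"` and `"cheese fries"` (shared prefix `cheese`, six characters), the uncapped boost gives about `0.92`, while capping the prefix at four characters gives about `0.88`.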

The normalized versions return values between `0.0` and `1.0`, where `1.0` means
an exact match.
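
As a rough illustration of how a distance can be mapped onto that range, the sketch below computes a Levenshtein distance with the standard library only and divides it by the length of the longer input. The function names are hypothetical, and the normalization formula is an assumption chosen to match the behavior described above, not a copy of the crate's implementation.

```rust
/// Two-row dynamic-programming Levenshtein distance over Unicode scalar values.
fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the processed prefix of `a`
    // and the first j characters of `b`.
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b_chars.iter().enumerate() {
            let cost = if ca == *cb { 0 } else { 1 };
            let best = (prev[j] + cost) // substitution (or match)
                .min(prev[j + 1] + 1)   // deletion
                .min(curr[j] + 1);      // insertion
            curr.push(best);
        }
        prev = curr;
    }
    prev[b_chars.len()]
}

/// Maps the distance to a similarity in [0.0, 1.0]; 1.0 means identical inputs.
/// Assumption: normalize by the character count of the longer input.
fn normalized_levenshtein_sketch(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0; // two empty strings are treated as an exact match
    }
    1.0 - levenshtein_sketch(a, b) as f64 / max_len as f64
}
```

For `"kitten"` and `"sitting"` the distance is 3 and the longer input has 7 characters, so the sketch yields `1 - 3/7`, roughly `0.571`.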

There are also generic versions of the functions for non-string inputs.

## Installation

`strsim` is available on [crates.io](https://crates.io/crates/strsim). Add it to
your `Cargo.toml`:
```toml
[dependencies]
strsim = "0.10.0"
```

## Usage

Go to [Docs.rs](https://docs.rs/strsim/) for the full documentation. You can
also clone the repo, and run `$ cargo doc --open`.

### Examples

```rust
extern crate strsim;

use strsim::{hamming, levenshtein, normalized_levenshtein, osa_distance,
             damerau_levenshtein, normalized_damerau_levenshtein, jaro,
             jaro_winkler, sorensen_dice};

fn main() {
    match hamming("hamming", "hammers") {
        Ok(distance) => assert_eq!(3, distance),
        Err(why) => panic!("{:?}", why)
    }

    assert_eq!(levenshtein("kitten", "sitting"), 3);

    assert!((normalized_levenshtein("kitten", "sitting") - 0.571).abs() < 0.001);

    assert_eq!(osa_distance("ac", "cba"), 3);

    assert_eq!(damerau_levenshtein("ac", "cba"), 2);

    assert!((normalized_damerau_levenshtein("levenshtein", "löwenbräu") - 0.272).abs() <
            0.001);

    assert!((jaro("Friedrich Nietzsche", "Jean-Paul Sartre") - 0.392).abs() <
            0.001);

    assert!((jaro_winkler("cheeseburger", "cheese fries") - 0.911).abs() <
            0.001);

    assert_eq!(sorensen_dice("web applications", "applications of the web"),
        0.7878787878787878);
}
```

Using the generic versions of the functions:

```rust
extern crate strsim;

use strsim::generic_levenshtein;

fn main() {
    assert_eq!(2, generic_levenshtein(&[1, 2, 3], &[0, 2, 5]));
}
```
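
To give a feel for how a metric can be written against arbitrary sequences instead of strings, here is a standard-library-only sketch of a Hamming distance over any two iterables with comparable items. The name `generic_hamming_sketch` and the `Option` return type are assumptions made for this example; the crate's own generic functions may use different bounds and error types.

```rust
/// Hamming distance over any two iterables whose items can be compared.
/// Returns `None` when the inputs differ in length (the distance is undefined).
fn generic_hamming_sketch<A, B, T>(a: A, b: B) -> Option<usize>
where
    A: IntoIterator<Item = T>,
    B: IntoIterator<Item = T>,
    T: PartialEq,
{
    let (mut left, mut right) = (a.into_iter(), b.into_iter());
    let mut distance = 0;
    loop {
        match (left.next(), right.next()) {
            // Both sequences still have items: count a mismatch if they differ.
            (Some(x), Some(y)) => {
                if x != y {
                    distance += 1;
                }
            }
            // Both ended at the same time: the lengths matched.
            (None, None) => return Some(distance),
            // One ended before the other: lengths differ.
            _ => return None,
        }
    }
}
```

The same function accepts `vec![1, 2, 3]` and `vec![0, 2, 5]` (distance 2) as well as `"hamming".chars()` and `"hammers".chars()` (distance 3), because it only requires that the items implement `PartialEq`.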

## Contributing

If you don't want to install Rust itself, you can run `$ ./dev` for a
development CLI if you have [Docker] installed.

Benchmarks require a Nightly toolchain. Run `$ cargo +nightly bench`.

## License

[MIT](https://github.com/dguo/strsim-rs/blob/master/LICENSE)

[string similarity metrics]:http://en.wikipedia.org/wiki/String_metric
[Damerau-Levenshtein]:http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
[Jaro and Jaro-Winkler]:http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
[Levenshtein]:http://en.wikipedia.org/wiki/Levenshtein_distance
[Hamming]:http://en.wikipedia.org/wiki/Hamming_distance
[Optimal string alignment]:https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#Optimal_string_alignment_distance
[Sørensen-Dice]:http://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient
[Docker]:https://docs.docker.com/engine/installation/