xref: /aosp_15_r20/external/libdivsufsort/README.md (revision 30b9430b2d8672faf9045aa522d63599a84e8e49)
1*30b9430bSXin Li# libdivsufsort
2*30b9430bSXin Li
3*30b9430bSXin Lilibdivsufsort is a software library that implements a lightweight suffix array construction algorithm.
4*30b9430bSXin Li
5*30b9430bSXin Li## News
6*30b9430bSXin Li* 2015-03-21: The project has moved from [Google Code](http://code.google.com/p/libdivsufsort/) to [GitHub](https://github.com/y-256/libdivsufsort)
7*30b9430bSXin Li
8*30b9430bSXin Li## Introduction
9*30b9430bSXin LiThis library provides a simple and an efficient C API to construct a suffix array and a Burrows-Wheeler transformed string from a given string over a constant-size alphabet.
10*30b9430bSXin LiThe algorithm runs in O(n log n) worst-case time using only 5n+O(1) bytes of memory space, where n is the length of
11*30b9430bSXin Lithe string.
12*30b9430bSXin Li
13*30b9430bSXin Li## Build requirements
14*30b9430bSXin Li* An ANSI C Compiler (e.g. GNU GCC)
15*30b9430bSXin Li* [CMake](http://www.cmake.org/ "CMake") version 2.4.2 or newer
16*30b9430bSXin Li* CMake-supported build tool
17*30b9430bSXin Li
18*30b9430bSXin Li## Building on GNU/Linux
19*30b9430bSXin Li1. Get the source code from GitHub. You can either
20*30b9430bSXin Li    * use git to clone the repository
21*30b9430bSXin Li    ```
22*30b9430bSXin Li    git clone https://github.com/y-256/libdivsufsort.git
23*30b9430bSXin Li    ```
24*30b9430bSXin Li    * or download a [zip file](../../archive/master.zip) directly
25*30b9430bSXin Li2. Create a `build` directory in the package source directory.
26*30b9430bSXin Li```shell
27*30b9430bSXin Li$ cd libdivsufsort
28*30b9430bSXin Li$ mkdir build
29*30b9430bSXin Li$ cd build
30*30b9430bSXin Li```
31*30b9430bSXin Li3. Configure the package for your system.
32*30b9430bSXin LiIf you want to install to a different location,  change the -DCMAKE_INSTALL_PREFIX option.
33*30b9430bSXin Li```shell
34*30b9430bSXin Li$ cmake -DCMAKE_BUILD_TYPE="Release" \
35*30b9430bSXin Li-DCMAKE_INSTALL_PREFIX="/usr/local" ..
36*30b9430bSXin Li```
37*30b9430bSXin Li4. Compile the package.
38*30b9430bSXin Li```shell
39*30b9430bSXin Li$ make
40*30b9430bSXin Li```
41*30b9430bSXin Li5. (Optional) Install the library and header files.
42*30b9430bSXin Li```shell
43*30b9430bSXin Li$ sudo make install
44*30b9430bSXin Li```
45*30b9430bSXin Li
46*30b9430bSXin Li## API
47*30b9430bSXin Li```c
48*30b9430bSXin Li/* Data types */
49*30b9430bSXin Litypedef int32_t saint_t;
50*30b9430bSXin Litypedef int32_t saidx_t;
51*30b9430bSXin Litypedef uint8_t sauchar_t;
52*30b9430bSXin Li
53*30b9430bSXin Li/*
54*30b9430bSXin Li * Constructs the suffix array of a given string.
55*30b9430bSXin Li * @param T[0..n-1] The input string.
56*30b9430bSXin Li * @param SA[0..n-1] The output array or suffixes.
57*30b9430bSXin Li * @param n The length of the given string.
58*30b9430bSXin Li * @return 0 if no error occurred, -1 or -2 otherwise.
59*30b9430bSXin Li */
60*30b9430bSXin Lisaint_t
61*30b9430bSXin Lidivsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n);
62*30b9430bSXin Li
63*30b9430bSXin Li/*
64*30b9430bSXin Li * Constructs the burrows-wheeler transformed string of a given string.
65*30b9430bSXin Li * @param T[0..n-1] The input string.
66*30b9430bSXin Li * @param U[0..n-1] The output string. (can be T)
67*30b9430bSXin Li * @param A[0..n-1] The temporary array. (can be NULL)
68*30b9430bSXin Li * @param n The length of the given string.
69*30b9430bSXin Li * @return The primary index if no error occurred, -1 or -2 otherwise.
70*30b9430bSXin Li */
71*30b9430bSXin Lisaidx_t
72*30b9430bSXin Lidivbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n);
73*30b9430bSXin Li```
74*30b9430bSXin Li
75*30b9430bSXin Li## Example Usage
76*30b9430bSXin Li```c
77*30b9430bSXin Li#include <stdio.h>
78*30b9430bSXin Li#include <stdlib.h>
79*30b9430bSXin Li#include <string.h>
80*30b9430bSXin Li
81*30b9430bSXin Li#include <divsufsort.h>
82*30b9430bSXin Li
83*30b9430bSXin Liint main() {
84*30b9430bSXin Li    // intput data
85*30b9430bSXin Li    char *Text = "abracadabra";
86*30b9430bSXin Li    int n = strlen(Text);
87*30b9430bSXin Li    int i, j;
88*30b9430bSXin Li
89*30b9430bSXin Li    // allocate
90*30b9430bSXin Li    int *SA = (int *)malloc(n * sizeof(int));
91*30b9430bSXin Li
92*30b9430bSXin Li    // sort
93*30b9430bSXin Li    divsufsort((unsigned char *)Text, SA, n);
94*30b9430bSXin Li
95*30b9430bSXin Li    // output
96*30b9430bSXin Li    for(i = 0; i < n; ++i) {
97*30b9430bSXin Li        printf("SA[%2d] = %2d: ", i, SA[i]);
98*30b9430bSXin Li        for(j = SA[i]; j < n; ++j) {
99*30b9430bSXin Li            printf("%c", Text[j]);
100*30b9430bSXin Li        }
101*30b9430bSXin Li        printf("$\n");
102*30b9430bSXin Li    }
103*30b9430bSXin Li
104*30b9430bSXin Li    // deallocate
105*30b9430bSXin Li    free(SA);
106*30b9430bSXin Li
107*30b9430bSXin Li    return 0;
108*30b9430bSXin Li}
109*30b9430bSXin Li```
110*30b9430bSXin LiSee the [examples](examples) directory for a few other examples.
111*30b9430bSXin Li
112*30b9430bSXin Li## Benchmarks
113*30b9430bSXin LiSee [Benchmarks](https://github.com/y-256/libdivsufsort/blob/wiki/SACA_Benchmarks.md) page for details.
114*30b9430bSXin Li
115*30b9430bSXin Li## License
116*30b9430bSXin Lilibdivsufsort is released under the [MIT license](LICENSE "MIT license").
117*30b9430bSXin Li> The MIT License (MIT)
118*30b9430bSXin Li>
119*30b9430bSXin Li> Copyright (c) 2003 Yuta Mori All rights reserved.
120*30b9430bSXin Li>
121*30b9430bSXin Li> Permission is hereby granted, free of charge, to any person obtaining a copy
122*30b9430bSXin Li> of this software and associated documentation files (the "Software"), to deal
123*30b9430bSXin Li> in the Software without restriction, including without limitation the rights
124*30b9430bSXin Li> to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
125*30b9430bSXin Li> copies of the Software, and to permit persons to whom the Software is
126*30b9430bSXin Li> furnished to do so, subject to the following conditions:
127*30b9430bSXin Li>
128*30b9430bSXin Li> The above copyright notice and this permission notice shall be included in all
129*30b9430bSXin Li> copies or substantial portions of the Software.
130*30b9430bSXin Li>
131*30b9430bSXin Li> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
132*30b9430bSXin Li> IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
133*30b9430bSXin Li> FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
134*30b9430bSXin Li> AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
135*30b9430bSXin Li> LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
136*30b9430bSXin Li> OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
137*30b9430bSXin Li> SOFTWARE.
138*30b9430bSXin Li
139*30b9430bSXin Li## Author
140*30b9430bSXin Li* Yuta Mori
141