1*30b9430bSXin Li# libdivsufsort 2*30b9430bSXin Li 3*30b9430bSXin Lilibdivsufsort is a software library that implements a lightweight suffix array construction algorithm. 4*30b9430bSXin Li 5*30b9430bSXin Li## News 6*30b9430bSXin Li* 2015-03-21: The project has moved from [Google Code](http://code.google.com/p/libdivsufsort/) to [GitHub](https://github.com/y-256/libdivsufsort) 7*30b9430bSXin Li 8*30b9430bSXin Li## Introduction 9*30b9430bSXin LiThis library provides a simple and an efficient C API to construct a suffix array and a Burrows-Wheeler transformed string from a given string over a constant-size alphabet. 10*30b9430bSXin LiThe algorithm runs in O(n log n) worst-case time using only 5n+O(1) bytes of memory space, where n is the length of 11*30b9430bSXin Lithe string. 12*30b9430bSXin Li 13*30b9430bSXin Li## Build requirements 14*30b9430bSXin Li* An ANSI C Compiler (e.g. GNU GCC) 15*30b9430bSXin Li* [CMake](http://www.cmake.org/ "CMake") version 2.4.2 or newer 16*30b9430bSXin Li* CMake-supported build tool 17*30b9430bSXin Li 18*30b9430bSXin Li## Building on GNU/Linux 19*30b9430bSXin Li1. Get the source code from GitHub. You can either 20*30b9430bSXin Li * use git to clone the repository 21*30b9430bSXin Li ``` 22*30b9430bSXin Li git clone https://github.com/y-256/libdivsufsort.git 23*30b9430bSXin Li ``` 24*30b9430bSXin Li * or download a [zip file](../../archive/master.zip) directly 25*30b9430bSXin Li2. Create a `build` directory in the package source directory. 26*30b9430bSXin Li```shell 27*30b9430bSXin Li$ cd libdivsufsort 28*30b9430bSXin Li$ mkdir build 29*30b9430bSXin Li$ cd build 30*30b9430bSXin Li``` 31*30b9430bSXin Li3. Configure the package for your system. 32*30b9430bSXin LiIf you want to install to a different location, change the -DCMAKE_INSTALL_PREFIX option. 33*30b9430bSXin Li```shell 34*30b9430bSXin Li$ cmake -DCMAKE_BUILD_TYPE="Release" \ 35*30b9430bSXin Li-DCMAKE_INSTALL_PREFIX="/usr/local" .. 36*30b9430bSXin Li``` 37*30b9430bSXin Li4. Compile the package. 38*30b9430bSXin Li```shell 39*30b9430bSXin Li$ make 40*30b9430bSXin Li``` 41*30b9430bSXin Li5. (Optional) Install the library and header files. 42*30b9430bSXin Li```shell 43*30b9430bSXin Li$ sudo make install 44*30b9430bSXin Li``` 45*30b9430bSXin Li 46*30b9430bSXin Li## API 47*30b9430bSXin Li```c 48*30b9430bSXin Li/* Data types */ 49*30b9430bSXin Litypedef int32_t saint_t; 50*30b9430bSXin Litypedef int32_t saidx_t; 51*30b9430bSXin Litypedef uint8_t sauchar_t; 52*30b9430bSXin Li 53*30b9430bSXin Li/* 54*30b9430bSXin Li * Constructs the suffix array of a given string. 55*30b9430bSXin Li * @param T[0..n-1] The input string. 56*30b9430bSXin Li * @param SA[0..n-1] The output array or suffixes. 57*30b9430bSXin Li * @param n The length of the given string. 58*30b9430bSXin Li * @return 0 if no error occurred, -1 or -2 otherwise. 59*30b9430bSXin Li */ 60*30b9430bSXin Lisaint_t 61*30b9430bSXin Lidivsufsort(const sauchar_t *T, saidx_t *SA, saidx_t n); 62*30b9430bSXin Li 63*30b9430bSXin Li/* 64*30b9430bSXin Li * Constructs the burrows-wheeler transformed string of a given string. 65*30b9430bSXin Li * @param T[0..n-1] The input string. 66*30b9430bSXin Li * @param U[0..n-1] The output string. (can be T) 67*30b9430bSXin Li * @param A[0..n-1] The temporary array. (can be NULL) 68*30b9430bSXin Li * @param n The length of the given string. 69*30b9430bSXin Li * @return The primary index if no error occurred, -1 or -2 otherwise. 70*30b9430bSXin Li */ 71*30b9430bSXin Lisaidx_t 72*30b9430bSXin Lidivbwt(const sauchar_t *T, sauchar_t *U, saidx_t *A, saidx_t n); 73*30b9430bSXin Li``` 74*30b9430bSXin Li 75*30b9430bSXin Li## Example Usage 76*30b9430bSXin Li```c 77*30b9430bSXin Li#include <stdio.h> 78*30b9430bSXin Li#include <stdlib.h> 79*30b9430bSXin Li#include <string.h> 80*30b9430bSXin Li 81*30b9430bSXin Li#include <divsufsort.h> 82*30b9430bSXin Li 83*30b9430bSXin Liint main() { 84*30b9430bSXin Li // intput data 85*30b9430bSXin Li char *Text = "abracadabra"; 86*30b9430bSXin Li int n = strlen(Text); 87*30b9430bSXin Li int i, j; 88*30b9430bSXin Li 89*30b9430bSXin Li // allocate 90*30b9430bSXin Li int *SA = (int *)malloc(n * sizeof(int)); 91*30b9430bSXin Li 92*30b9430bSXin Li // sort 93*30b9430bSXin Li divsufsort((unsigned char *)Text, SA, n); 94*30b9430bSXin Li 95*30b9430bSXin Li // output 96*30b9430bSXin Li for(i = 0; i < n; ++i) { 97*30b9430bSXin Li printf("SA[%2d] = %2d: ", i, SA[i]); 98*30b9430bSXin Li for(j = SA[i]; j < n; ++j) { 99*30b9430bSXin Li printf("%c", Text[j]); 100*30b9430bSXin Li } 101*30b9430bSXin Li printf("$\n"); 102*30b9430bSXin Li } 103*30b9430bSXin Li 104*30b9430bSXin Li // deallocate 105*30b9430bSXin Li free(SA); 106*30b9430bSXin Li 107*30b9430bSXin Li return 0; 108*30b9430bSXin Li} 109*30b9430bSXin Li``` 110*30b9430bSXin LiSee the [examples](examples) directory for a few other examples. 111*30b9430bSXin Li 112*30b9430bSXin Li## Benchmarks 113*30b9430bSXin LiSee [Benchmarks](https://github.com/y-256/libdivsufsort/blob/wiki/SACA_Benchmarks.md) page for details. 114*30b9430bSXin Li 115*30b9430bSXin Li## License 116*30b9430bSXin Lilibdivsufsort is released under the [MIT license](LICENSE "MIT license"). 117*30b9430bSXin Li> The MIT License (MIT) 118*30b9430bSXin Li> 119*30b9430bSXin Li> Copyright (c) 2003 Yuta Mori All rights reserved. 120*30b9430bSXin Li> 121*30b9430bSXin Li> Permission is hereby granted, free of charge, to any person obtaining a copy 122*30b9430bSXin Li> of this software and associated documentation files (the "Software"), to deal 123*30b9430bSXin Li> in the Software without restriction, including without limitation the rights 124*30b9430bSXin Li> to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 125*30b9430bSXin Li> copies of the Software, and to permit persons to whom the Software is 126*30b9430bSXin Li> furnished to do so, subject to the following conditions: 127*30b9430bSXin Li> 128*30b9430bSXin Li> The above copyright notice and this permission notice shall be included in all 129*30b9430bSXin Li> copies or substantial portions of the Software. 130*30b9430bSXin Li> 131*30b9430bSXin Li> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 132*30b9430bSXin Li> IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 133*30b9430bSXin Li> FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 134*30b9430bSXin Li> AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 135*30b9430bSXin Li> LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 136*30b9430bSXin Li> OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 137*30b9430bSXin Li> SOFTWARE. 138*30b9430bSXin Li 139*30b9430bSXin Li## Author 140*30b9430bSXin Li* Yuta Mori 141