1e1051a39Sopenharmony_ciSparse Arrays 2e1051a39Sopenharmony_ci============= 3e1051a39Sopenharmony_ci 4e1051a39Sopenharmony_ciThe `sparse_array.c` file contains an implementation of a sparse array that 5e1051a39Sopenharmony_ciattempts to be both space and time efficient. 6e1051a39Sopenharmony_ci 7e1051a39Sopenharmony_ciThe sparse array is represented using a tree structure. Each node in the 8e1051a39Sopenharmony_citree contains a block of pointers to either the user supplied leaf values or 9e1051a39Sopenharmony_cito another node. 10e1051a39Sopenharmony_ci 11e1051a39Sopenharmony_ciThere are a number of parameters used to define the block size: 12e1051a39Sopenharmony_ci 13e1051a39Sopenharmony_ci OPENSSL_SA_BLOCK_BITS Specifies the number of bits covered by each block 14e1051a39Sopenharmony_ci SA_BLOCK_MAX Specifies the number of pointers in each block 15e1051a39Sopenharmony_ci SA_BLOCK_MASK Specifies a bit mask to perform modulo block size 16e1051a39Sopenharmony_ci SA_BLOCK_MAX_LEVELS Indicates the maximum possible height of the tree 17e1051a39Sopenharmony_ci 18e1051a39Sopenharmony_ciThese constants are inter-related: 19e1051a39Sopenharmony_ci 20e1051a39Sopenharmony_ci SA_BLOCK_MAX = 2 ^ OPENSSL_SA_BLOCK_BITS 21e1051a39Sopenharmony_ci SA_BLOCK_MASK = SA_BLOCK_MAX - 1 22e1051a39Sopenharmony_ci SA_BLOCK_MAX_LEVELS = number of bits in size_t divided by 23e1051a39Sopenharmony_ci OPENSSL_SA_BLOCK_BITS rounded up to the next multiple 24e1051a39Sopenharmony_ci of OPENSSL_SA_BLOCK_BITS 25e1051a39Sopenharmony_ci 26e1051a39Sopenharmony_ci`OPENSSL_SA_BLOCK_BITS` can be defined at compile time and this overrides the 27e1051a39Sopenharmony_cibuilt in setting. 28e1051a39Sopenharmony_ci 29e1051a39Sopenharmony_ciAs a space and performance optimisation, the height of the tree is usually 30e1051a39Sopenharmony_ciless than the maximum possible height. Only sufficient height is allocated to 31e1051a39Sopenharmony_ciaccommodate the largest index added to the data structure. 32e1051a39Sopenharmony_ci 33e1051a39Sopenharmony_ciThe largest index used to add a value to the array determines the tree height: 34e1051a39Sopenharmony_ci 35e1051a39Sopenharmony_ci +----------------------+---------------------+ 36e1051a39Sopenharmony_ci | Largest Added Index | Height of Tree | 37e1051a39Sopenharmony_ci +----------------------+---------------------+ 38e1051a39Sopenharmony_ci | SA_BLOCK_MAX - 1 | 1 | 39e1051a39Sopenharmony_ci | SA_BLOCK_MAX ^ 2 - 1 | 2 | 40e1051a39Sopenharmony_ci | SA_BLOCK_MAX ^ 3 - 1 | 3 | 41e1051a39Sopenharmony_ci | ... | ... | 42e1051a39Sopenharmony_ci | size_t max | SA_BLOCK_MAX_LEVELS | 43e1051a39Sopenharmony_ci +----------------------+---------------------+ 44e1051a39Sopenharmony_ci 45e1051a39Sopenharmony_ciThe tree height is dynamically increased as needed based on additions. 46e1051a39Sopenharmony_ci 47e1051a39Sopenharmony_ciAn empty tree is represented by a NULL root pointer. Inserting a value at 48e1051a39Sopenharmony_ciindex 0 results in the allocation of a top level node full of null pointers 49e1051a39Sopenharmony_ciexcept for the single pointer to the user's data (N = SA_BLOCK_MAX for 50e1051a39Sopenharmony_cibrevity): 51e1051a39Sopenharmony_ci 52e1051a39Sopenharmony_ci +----+ 53e1051a39Sopenharmony_ci |Root| 54e1051a39Sopenharmony_ci |Node| 55e1051a39Sopenharmony_ci +-+--+ 56e1051a39Sopenharmony_ci | 57e1051a39Sopenharmony_ci | 58e1051a39Sopenharmony_ci | 59e1051a39Sopenharmony_ci v 60e1051a39Sopenharmony_ci +-+-+---+---+---+---+ 61e1051a39Sopenharmony_ci | 0 | 1 | 2 |...|N-1| 62e1051a39Sopenharmony_ci | |nil|nil|...|nil| 63e1051a39Sopenharmony_ci +-+-+---+---+---+---+ 64e1051a39Sopenharmony_ci | 65e1051a39Sopenharmony_ci | 66e1051a39Sopenharmony_ci | 67e1051a39Sopenharmony_ci v 68e1051a39Sopenharmony_ci +-+--+ 69e1051a39Sopenharmony_ci |User| 70e1051a39Sopenharmony_ci |Data| 71e1051a39Sopenharmony_ci +----+ 72e1051a39Sopenharmony_ci Index 0 73e1051a39Sopenharmony_ci 74e1051a39Sopenharmony_ciInserting at element 2N+1 creates a new root node and pushes down the old root 75e1051a39Sopenharmony_cinode. It then creates a second second level node to hold the pointer to the 76e1051a39Sopenharmony_ciuser's new data: 77e1051a39Sopenharmony_ci 78e1051a39Sopenharmony_ci +----+ 79e1051a39Sopenharmony_ci |Root| 80e1051a39Sopenharmony_ci |Node| 81e1051a39Sopenharmony_ci +-+--+ 82e1051a39Sopenharmony_ci | 83e1051a39Sopenharmony_ci | 84e1051a39Sopenharmony_ci | 85e1051a39Sopenharmony_ci v 86e1051a39Sopenharmony_ci +-+-+---+---+---+---+ 87e1051a39Sopenharmony_ci | 0 | 1 | 2 |...|N-1| 88e1051a39Sopenharmony_ci | |nil| |...|nil| 89e1051a39Sopenharmony_ci +-+-+---+-+-+---+---+ 90e1051a39Sopenharmony_ci | | 91e1051a39Sopenharmony_ci | +------------------+ 92e1051a39Sopenharmony_ci | | 93e1051a39Sopenharmony_ci v v 94e1051a39Sopenharmony_ci +-+-+---+---+---+---+ +-+-+---+---+---+---+ 95e1051a39Sopenharmony_ci | 0 | 1 | 2 |...|N-1| | 0 | 1 | 2 |...|N-1| 96e1051a39Sopenharmony_ci |nil| |nil|...|nil| |nil| |nil|...|nil| 97e1051a39Sopenharmony_ci +-+-+---+---+---+---+ +---+-+-+---+---+---+ 98e1051a39Sopenharmony_ci | | 99e1051a39Sopenharmony_ci | | 100e1051a39Sopenharmony_ci | | 101e1051a39Sopenharmony_ci v v 102e1051a39Sopenharmony_ci +-+--+ +-+--+ 103e1051a39Sopenharmony_ci |User| |User| 104e1051a39Sopenharmony_ci |Data| |Data| 105e1051a39Sopenharmony_ci +----+ +----+ 106e1051a39Sopenharmony_ci Index 0 Index 2N+1 107e1051a39Sopenharmony_ci 108e1051a39Sopenharmony_ciThe nodes themselves are allocated in a sparse manner. Only nodes which exist 109e1051a39Sopenharmony_cialong a path from the root of the tree to an added leaf will be allocated. 110e1051a39Sopenharmony_ciThe complexity is hidden and nodes are allocated on an as needed basis. 111e1051a39Sopenharmony_ciBecause the data is expected to be sparse this doesn't result in a large waste 112e1051a39Sopenharmony_ciof space. 113e1051a39Sopenharmony_ci 114e1051a39Sopenharmony_ciValues can be removed from the sparse array by setting their index position to 115e1051a39Sopenharmony_ciNULL. The data structure does not attempt to reclaim nodes or reduce the 116e1051a39Sopenharmony_ciheight of the tree on removal. For example, now setting index 0 to NULL would 117e1051a39Sopenharmony_ciresult in: 118e1051a39Sopenharmony_ci 119e1051a39Sopenharmony_ci +----+ 120e1051a39Sopenharmony_ci |Root| 121e1051a39Sopenharmony_ci |Node| 122e1051a39Sopenharmony_ci +-+--+ 123e1051a39Sopenharmony_ci | 124e1051a39Sopenharmony_ci | 125e1051a39Sopenharmony_ci | 126e1051a39Sopenharmony_ci v 127e1051a39Sopenharmony_ci +-+-+---+---+---+---+ 128e1051a39Sopenharmony_ci | 0 | 1 | 2 |...|N-1| 129e1051a39Sopenharmony_ci | |nil| |...|nil| 130e1051a39Sopenharmony_ci +-+-+---+-+-+---+---+ 131e1051a39Sopenharmony_ci | | 132e1051a39Sopenharmony_ci | +------------------+ 133e1051a39Sopenharmony_ci | | 134e1051a39Sopenharmony_ci v v 135e1051a39Sopenharmony_ci +-+-+---+---+---+---+ +-+-+---+---+---+---+ 136e1051a39Sopenharmony_ci | 0 | 1 | 2 |...|N-1| | 0 | 1 | 2 |...|N-1| 137e1051a39Sopenharmony_ci |nil|nil|nil|...|nil| |nil| |nil|...|nil| 138e1051a39Sopenharmony_ci +---+---+---+---+---+ +---+-+-+---+---+---+ 139e1051a39Sopenharmony_ci | 140e1051a39Sopenharmony_ci | 141e1051a39Sopenharmony_ci | 142e1051a39Sopenharmony_ci v 143e1051a39Sopenharmony_ci +-+--+ 144e1051a39Sopenharmony_ci |User| 145e1051a39Sopenharmony_ci |Data| 146e1051a39Sopenharmony_ci +----+ 147e1051a39Sopenharmony_ci Index 2N+1 148e1051a39Sopenharmony_ci 149e1051a39Sopenharmony_ciAccesses to elements in the sparse array take O(log n) time where n is the 150e1051a39Sopenharmony_cilargest element. The base of the logarithm is `SA_BLOCK_MAX`, so for moderately 151e1051a39Sopenharmony_cismall indices (e.g. NIDs), single level (constant time) access is achievable. 152e1051a39Sopenharmony_ciSpace usage is O(minimum(m, n log(n)) where m is the number of elements in the 153e1051a39Sopenharmony_ciarray. 154e1051a39Sopenharmony_ci 155e1051a39Sopenharmony_ciNote: sparse arrays only include pointers to types. 156e1051a39Sopenharmony_ciThus, `SPARSE_ARRAY_OF(char)` can be used to store a string. 157