# How to find an element in a linked list of blocks (containing n elements) as fast as possible?

My data structure is a linked list of blocks. A block contains 31 elements of 4 byte and one 4 byte pointer to the next block or NULL(in summary 128 bytes per block). I add elements from time to time. If the last block is full, I add another block via pointer.

One objective is to use as less memory (= blocks) as possible and having no free space between two elements in a block.

This setting is fix. All code runs on a 32-bit ARM Cortex-A8 CPU with NEON pipeline.

Question: How to find a specific element in that data structure as quickly as possible?

Approach (right now): I use sorted blocks and binary search to check for an element (9 bit of the 4 byte are the search criteria). If the desired element is not in the current block I jump to the next block. If the element is not in the last block and the last block is not yet full, I use the result of the binary search to insert the new element (if necessary I make space using memmove within this block). Thus all blocks are always sorted.

Do you have an idea to make that faster? This is how I search right now: (q->getPosition() is an inline function that just extracts the 9-bit position from the element via "& bitmask")

```do
{
// binary search algorithm (bsearch)
// p?hl=en#qoCVjtE_vOw/gcc4/trunk/gcc-
// 4.4.3/libiberty/bsearch.c&q=bsearch&sa=N&cd=2&ct=rc

base = &(block->points[0]);

if (block->next == NULL)
{
pointsInBlock = pointsInLastBlock;
stop = true;
}
else
{
block = block->next;
}

for (lim = pointsInBlock; lim != 0; lim >>= 1)
{
q = base + (lim >> 1);

cmp = quantizedPosition - q->getPosition();

if (cmp > 0)
{
// quantizedPosition > q: move right
base = q + 1;
lim--;
}
else if (cmp == 0)
{
// We found the QuantPoint
*outQuantPoint = q;
return true;
}
// else move left
}
}
while (!stop);
```

Since the bulk of the time is spent in the within-block search, that needs to be as fast as possible. Since the number of elements is fixed, you can completely unroll that loop, as in:

```if (key < a[16]){
if (key < a[8]){
...
}
else { // key >= a[8] && key < a[16]
...
}
}
else { // key >= a[16]
if (key < a[24]){
...
}
else { // key >= a[24]
...
}
}
```

Study the generated assembly language and single-step it in a debugger, to make sure the compiler's giving you good code.

You might want to write a little program to print out the above code, as it will be hard to write by hand, or possibly generate it with macros.

ADDED: Just noticed your 9-bit search criterion. In that case, just pre-allocate an array of 512 4-byte words, and index it directly. That's the fastest, and the least code.

ALSO ADDED: If you need to keep your block structure, there's another way to do the unrolled binary search. It's the Jon Bentley method:

```i = 0;
if (key >= a[i+16]) i += 16;
if (key >= a[i+ 8]) i +=  8;
if (key >= a[i+ 4]) i +=  4;
if (key >= a[i+ 2]) i +=  2;
if (i < 30 && key >= a[i+ 1]) i +=  1; // this excludes 31
if (key == a[i]) // then key is found
```

That's slower than the if-tree above, because of manipulating i, but could be substantially less code.