Post

Comparing Different Ways to Update a List

I have been working on building a workflow at work. One of the optimisation points I was considering was updating the elements in an immutable List. Things are certainly much more manageable when dealing with immutable data, but sometimes It brings additional challenges. I implemented the first solution that came to my mind at that point. Now I want to experiment with some other alternative ones.

Setup

Let’s start with building the script. My use case is that I have a list of containers that could carry different data types. Storage does not need to know the kind of data I am updating. The goal is to have a list implementation that I wouldn’t need to worry about too much.

1
2
3
4
data class Container(  
    val id: Int,    
    val payload: String = ""  
)

We could define the Data as simply as possible. Right now, all the payload types are strings.

1
2
3
4
5
6
7
8
9
data class ExperimentData(  
    private val listSize: Int,  
    private val updateTimes: Int  
) {  
    val componentList = (0..listSize).map { Container(id = it, order = it.inc()) }  
    val updateList = buildList {  
        repeat(updateTimes) { add(Random.nextInt(0..listSize)) }  
    }.toList()  
}

This data class lets us define the list we are working on and the number of update operations we need to execute on the list. For a rough benchmark, the initial parameters will be:

1
val dataset = ExperimentData(listSize = 300, updateTimes = 50000)

We also need a way to define the experiments and print the results.

1
2
3
4
5
6
7
8
9
10
data class Experiment(  
    val name: String,  
    val experimentBlock: (ExperimentData) -> Unit  
) {  
    fun runWith(data: ExperimentData) {  
        print("\nMethod: $name ")  
        val duration = measureTime { experimentBlock(data) }  
        print("| Duration: ${duration.absoluteValue}\n")  
    }  
}

The experimentBlock scope gives us a way to implement the method with access to ExperimentData. We then run the block and print out the duration it took to run the whole thing. There might be better ways to measure the performance. But for the scope of this experiment, it is enough.

Lastly, we will add each experiment to an empty Experiment list and run them using the above dataset.

Methods

The first method I thought of was to use the map to iterate over the list elements and copy the data once I found the correct id.

1
2
3
4
5
6
7
8
Experiment(name = "Map") { data ->  
    var componentList = data.componentList  
    data.updateList.forEach { index ->  
        componentList = componentList.map {  
            if (it.id == index) it.copy(payload = "Updated") else it  
        }    
	}
}

This works. But we traverse the whole list on every update. The performance was not bad, but we could do it better.

1
Method: Map | Duration: 85.074458ms

What if we find the target element, update it with the new contents, and then build a new version of the list? The next approach does that.

1
2
3
4
5
6
7
8
9
10
11
Experiment(name = "Sublist") { data ->  
    var componentList = data.componentList  
    data.updateList.forEach { index ->  
        val targetComponent = componentList[index]  
        componentList = buildList<Container> {  
            addAll(componentList.subList(0,index))  
            add(targetComponent.copy(payload = "Updated"))  
            addAll(componentList.subList(index, componentList.lastIndex))  
        }  
    }
}

finding the element and updating it is done by accessing with index. I built the list by combining all the parts. I divided the head and tail sections of the list using subList method. We improved the performance by 25%.

Accessing the element and changing the data is okay. But we are using two subList operations while we are building the new version of the list. Since the data is immutable, we can’t just change the desired element. However, we can use toMutableList() method to cast the list to a mutable one, make the update, and cast it to an immutable version again by using the toList() extension function.

1
2
3
4
5
6
7
8
Experiment(name = "MutableList") { data ->  
    var componentList = data.componentList  
    data.updateList.forEach { index ->  
        componentList = componentList.toMutableList().apply {  
            this[index] = this[index].copy(payload = "Updated")  
        }.toList()  
    }  
}

This approach improved the performance significantly.

1
Method: MutableList | Duration: 22.168542ms

At this point, I was happy with this solution. But the developer in me wanted a bit more tidiness in the code, so I wrote an extension function to make this update part a bit cleaner.

1
2
3
fun <T> List<T>.update(index: Int, block: (T) -> (T)): List<T> {  
    return this.toMutableList().apply { this[index] = block(this[index]) }.toList()  
}

I wanted to see whether this version improves the performance or not:

1
2
3
4
5
6
7
8
Experiment(name = "MutableList with extension") { data ->  
    var componentList = data.componentList  
    data.updateList.forEach { index ->  
        componentList = componentList.update(index) { 
            it.copy(payload = "Updated")
        }  
    }
}

And it did improve by almost 50%.

1
Method: MutableList with extension | Duration: 12.846166ms

As far as I can see, the Kotlin ByteCode can be optimised a bit more because we moved the update operation to an extension function. I don’t know if that’s the actual reason but the further examination is due for another time.

Lastly, I want to compare the four methods with different datasets. First, let’s increase the listSize:

1
val dataset = ExperimentData(listSize = 30000, updateTimes = 50000)

Methods are behaving similarly.

1
2
3
4
- Method: Map | Duration: 7.168025542s
- Method: Sublist | Duration: 3.607699917s
- Method: MutableList | Duration: 752.886ms
- Method: MutableList with extension | Duration: 650.442875ms

For my use case, listSize probably will not be greater than 100, but the updateTimes can be high so we tweak the dataset like this:

1
val dataset = ExperimentData(listSize = 100, updateTimes = 5000000)
1
2
3
4
- Method: Map | Duration: 2.433019792s
- Method: Sublist | Duration: 1.489141333s
- Method: MutableList | Duration: 195.430875ms
- Method: MutableList with extension | Duration: 180.124916ms

Converting the immutable list to a mutable one and updating the element using an extension function is the most performant one among the other ones. It is a small section of the overall problem I am working on but the Kotlin language makes it fun to explore different options and comparing them. You can find the complete script from here.

This post is licensed under CC BY 4.0 by the author.