Comparing Different Ways to Update a List
I have been working on building a workflow at work. One of the optimisation points I was considering was updating the elements in an immutable List. Things are certainly much more manageable when dealing with immutable data, but sometimes It brings additional challenges. I implemented the first solution that came to my mind at that point. Now I want to experiment with some other alternative ones.
Setup
Let’s start with building the script. My use case is that I have a list of containers that could carry different data types. Storage does not need to know the kind of data I am updating. The goal is to have a list implementation that I wouldn’t need to worry about too much.
We could define the Data as simply as possible. Right now, all the payload types are strings.
This data class lets us define the list we are working on and the number of update operations we need to execute on the list. For a rough benchmark, the initial parameters will be:
We also need a way to define the experiments and print the results.
The experimentBlock
scope gives us a way to implement the method with access to ExperimentData
. We then run the
block and print out the duration it took to run the whole thing. There might be better ways to measure the performance.
But for the scope of this experiment, it is enough.
Lastly, we will add each experiment to an empty Experiment list and run them using the above dataset.
Methods
The first method I thought of was to use the map to iterate over the list elements and copy the data once I found the correct id.
This works. But we traverse the whole list on every update. The performance was not bad, but we could do it better.
What if we find the target element, update it with the new contents, and then build a new version of the list? The next approach does that.
finding the element and updating it is done by accessing with index. I built the list by combining all the parts. I
divided the head and tail sections of the list using subList
method. We improved the performance by 25%.
Accessing the element and changing the data is okay. But we are using two subList operations while we are building the
new version of the list. Since the data is immutable, we can’t just change the desired element. However, we can use
toMutableList()
method to cast the list to a mutable one, make the update, and cast it to an immutable version again
by using the toList()
extension function.
This approach improved the performance significantly.
At this point, I was happy with this solution. But the developer in me wanted a bit more tidiness in the code, so I wrote an extension function to make this update part a bit cleaner.
I wanted to see whether this version improves the performance or not:
And it did improve by almost 50%.
As far as I can see, the Kotlin ByteCode can be optimised a bit more because we moved the update operation to an extension function. I don’t know if that’s the actual reason but the further examination is due for another time.
Lastly, I want to compare the four methods with different datasets. First, let’s increase the listSize:
Methods are behaving similarly.
For my use case, listSize probably will not be greater than 100, but the updateTimes can be high so we tweak the dataset like this:
Converting the immutable list to a mutable one and updating the element using an extension function is the most performant one among the other ones. It is a small section of the overall problem I am working on but the Kotlin language makes it fun to explore different options and comparing them. You can find the complete script from here.