Photo by Filip Zrnzević on Unsplash

Exploration vs Exploitation. Which way to go?

We are all trying to keep our lives in place. As we grow older, it becomes more important to cultivate stability and reduce risk, partly because of the extra responsibilities we take on from our surroundings and from the system itself. There is an observable transition from a child’s irresponsible wandering to an adult’s tediously planned next steps.

We actually have a very extended childhood compared to our closest primate relatives. Chimpanzees already become self-sufficient around 7 years old, whereas for humans self-sufficiency starts around the age of 15 at the earliest (nowadays in most countries full-time working age is accepted to start at 16). Homo sapiens adults are obliged to feed and protect their children until the children can do so independently. Thus they need to make sure their child’s every next step is planned to be safe. The child is then completely free to do what it wants, thanks to the parents. As parental protection and feeding taper off with age, the young one learns to limit the risk of dying from random and dangerous acts such as petting a crocodile or going with a stranger to buy an ice cream.

We learn safe ways of living, having fun and working. We try to avoid anything that is risky. However, this extreme cautiousness sometimes traps us in a local optimum, and our growth stops. The opposite of focusing on short-term stability and continuous gains is maximizing total rewards over the long term while accepting short-term volatility. It is a choice between exploration and exploitation.

When we talk about decision-making, we usually focus just on the immediate payoff of a single decision — and if you treat every decision as if it were your last, then indeed only exploitation makes sense. But over a lifetime, you’re going to make a lot of decisions. And it’s actually rational to emphasize exploration — the new rather than the best, the exciting rather than the safe, the random rather than the considered — for many of those choices, particularly earlier in life.

Algorithms to Live By

Exploitation fits when the goal is to maximize gains within the near future, whereas in exploration mode we focus on maximizing long-run rewards in the face of an uncertain and potentially risky future. So, which one should we adopt: playing in the safest and fastest-rewarding mode, or the one with open-ended potential for gains and losses over the long run? This is also an attractive topic in mathematics and AI research, known as the “multi-armed bandit” problem. The multi-armed bandit problem looks for the best strategy for playing a row of slot machines (“pokies”), where every pull is a choice between trying different machines to test them out (exploration) and staying loyal to the most promising machine we have seen so far (exploitation).
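To make the slot machine picture concrete, here is a minimal sketch of one common bandit strategy, epsilon-greedy. It is only an illustration, not a method taken from the post or from the book quoted above: the function name, the payout probabilities and the value of epsilon are all made-up assumptions. With probability epsilon the player tries a random machine (exploration); otherwise the player pulls the machine with the best average payout seen so far (exploitation).

```python
import random


def epsilon_greedy(payout_probs, pulls=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy player on a row of slot machines.

    payout_probs: hypothetical win probability of each machine
                  (hidden from the player, used only to draw rewards).
    Returns (total_reward, pull_counts).
    """
    rng = random.Random(seed)
    n = len(payout_probs)
    counts = [0] * n        # how many times each machine was played
    estimates = [0.0] * n   # running average payout observed per machine
    total = 0

    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                            # explore
        else:
            arm = max(range(n), key=lambda i: estimates[i])   # exploit
        reward = 1 if rng.random() < payout_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the average payout for this machine.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward

    return total, counts


if __name__ == "__main__":
    machines = [0.2, 0.8]  # assumed payout rates, for illustration only
    print("some exploration: ", epsilon_greedy(machines, epsilon=0.1))
    print("pure exploitation:", epsilon_greedy(machines, epsilon=0.0))
```

With epsilon set to 0 this player never leaves the first machine, however poor it is, because it never gathers evidence about the alternatives. Even a small amount of exploration is enough to discover the better machine in this toy setup.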

For me, looking at life, there are many things that are uncertain, and there is no running away from that. Therefore, we should not avoid experimenting with new things, in as many variations as we can. As we learn more about the gains and risks of an option by trying it out, and as we grow more certain about it, we can gradually increase the weight we give to that choice. However, we should still keep looking for new things, because we might unknowingly be stuck in a local optimum, where further exploration, at the risk of lower gains or higher costs in the short term, could have put us in a more favorable position.
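The idea of weighting a choice more heavily as certainty grows, while never fully giving up on the alternatives, resembles a standard bandit strategy called UCB1 (upper confidence bound). The post does not name it, so treat the following as a hedged sketch under assumptions: the function name and the payout probabilities are hypothetical. Each option is scored by its observed average plus an uncertainty bonus that shrinks the more often the option has been tried, so rarely tried options keep getting an occasional chance.

```python
import math
import random


def ucb1(payout_probs, pulls=5_000, seed=0):
    """Simulate a UCB1 player; returns how often each option was chosen.

    payout_probs: hypothetical win probability of each option
                  (hidden from the player, used only to draw rewards).
    """
    rng = random.Random(seed)
    n = len(payout_probs)
    counts = [0] * n        # how many times each option was tried
    estimates = [0.0] * n   # running average reward per option

    for t in range(1, pulls + 1):
        if t <= n:
            arm = t - 1  # try every option once before comparing them
        else:
            # Observed average plus a bonus that is large for options
            # we are still uncertain about (few tries so far).
            arm = max(
                range(n),
                key=lambda i: estimates[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = 1 if rng.random() < payout_probs[arm] else 0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts


if __name__ == "__main__":
    # Assumed payout rates, for illustration only.
    print(ucb1([0.2, 0.5, 0.8]))
```

In this sketch most choices end up going to the best option, yet the weaker ones are still revisited from time to time, which is the behaviour the paragraph above argues for.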

A playful mind is inquisitive, and learning is fun. If you indulge your natural curiosity and retain a sense of fun in new experience, I think you’ll find it functions as a sort of shock absorber for the bumpy road ahead.

Bill Watterson

Fatigue and ennui are probably evolved triggers that help us shift from exploit mode to explore mode. Experimenting helps us learn more, gain information and knowledge, and increase certainty by uncovering more of the variables involved, and thus gives us a better chance of pursuing the best outcome given what we have found.

To be successful in an environment with such dynamic shifts, organisations and individuals should make experimentation, exploration and continuous innovation part of their beliefs, cultures and mindsets. Instead of focusing on return on investment (ROI) during the exploration phase, we should focus on decreasing the uncertainty of our path towards the long-term vision.

We all have our own dilemmas in every part of our lives, both personal and professional. Desires, needs, constraints, opportunities and obstacles are randomly distributed across all stages of our lives, and we have no knowledge of them in advance. However, if we do not explore and discover what we want to be and what we want to do, our lives will be dull and dissatisfying. We should always look for new, innovative and creative ways to live our lives.


