I was a junior high school student when I first encountered the word "entropy". A science book I was reading stated that "a living being lives on negative entropy". I could not fully understand what entropy was, but the mysterious word stayed in my mind until a university lecture on thermodynamics. In the lecture, Prof. Sasa explained the definition of entropy and the second law of thermodynamics, that is, that entropy never decreases. Furthermore, he said the law is an empirical rule that cannot be proven. After the lecture I still did not fully understand entropy, but I realized that I wanted to know its nature and origin more deeply and to understand what he had said in the lecture.
A person learns intensively, and remembers efficiently, what he is interested in. Therefore, the most effective way to teach something is to get the learner interested in it. I think the most important thing a university professor should do in a lecture is not to teach many equations but to show how wonderful or useful the subject is. Of course, writing down many equations on a blackboard is one effective way to show the subject's depth, but the aim should be to get students interested, not merely to teach them.
Entropy is a physical quantity that can be defined in several ways and interpreted in many ways. The most intuitive interpretation is as a degree of disorder. Well-mixed milk coffee has higher entropy than separated milk and coffee. A state of lower entropy naturally changes into a state of higher entropy, but the opposite change seldom happens. This is the second law of thermodynamics, which states that entropy (almost) never decreases. Strictly speaking, this is a tendency rather than a law: the separation of milk coffee back into its individual ingredients violates no law of physics, neither Newton's laws of motion nor quantum mechanics. It is just that the probability of such a phenomenon is negligible.
Why does entropy tend to increase? To answer this question, you need a more precise definition of entropy. Imagine a die on a rapidly vibrating table that frequently flips it over. Normally you can distinguish the six different pips. But suppose you lump the pips 1 through 5 together as a single state "NOT6", or imagine a die whose five faces are white and whose remaining face shows six pips. Entropy is defined in terms of the number of states identified as one (*1). So the entropy of "NOT6", which contains more states and therefore occurs with higher probability, is larger than that of the state "6". When the pip is "6", you are very likely to get "NOT6" in the next turn. On the other hand, the probability of returning to "6" is low (but not zero). Entropy tends to increase simply because states of large entropy occur far more often.
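The dice argument above can be checked numerically. The following is a minimal sketch of my own (not from the lecture), rolling a fair die many times and comparing the coarse-grained states "NOT6" and "6"; Boltzmann's constant is set to 1 for simplicity.

```python
import math
import random

# Roll a fair die many times and count how often the coarse-grained
# state "NOT6" (pips 1 through 5) appears versus the state "6".
random.seed(0)
rolls = [random.randint(1, 6) for _ in range(60_000)]
not6 = sum(1 for r in rolls if r != 6)
print(f"fraction of NOT6: {not6 / len(rolls):.3f}")  # close to 5/6

# Entropy S = k log W of each coarse-grained state, with k = 1:
S_not6 = math.log(5)  # "NOT6" lumps together W = 5 states
S_6 = math.log(1)     # "6" is W = 1 state, so S = 0
print(f"S(NOT6) = {S_not6:.3f}, S(6) = {S_6:.3f}")
```

The die spends about five sixths of its time in "NOT6" simply because that macrostate contains five of the six microstates, which is the whole content of the "tendency" to higher entropy.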
How much larger is the entropy of mixed milk coffee than the sum of the entropies of separate milk and coffee? Whether mixed or not, the number of molecules of each kind is the same. The difference lies in their positions, in the density distribution, like the patterns of floating milk. A right-handed spiral, a left-handed spiral, a spotted pattern, and so on can all be distinguished. But the homogeneously well-mixed states, which in fact comprise a far larger number of position states (in other words, a larger entropy) than the separated state, are indistinguishable from one another and are identified as a single state. A well-mixed state includes so many states that the probability of its changing back to a non-mixed state is infinitesimal, like a die on a table with a huge number of faces, only a few of which are distinct. (*2)
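To make the counting concrete, here is a toy lattice model of my own (an illustration, not a rigorous treatment): place k "milk" molecules on n sites in a cup. The separated state confines the milk to the top half of the sites, while the mixed state allows it anywhere.

```python
import math

# Toy lattice: n sites in the cup, k milk molecules.
n, k = 100, 20

# Number of position states (arrangements) in each macrostate:
W_separated = math.comb(n // 2, k)  # milk confined to the top 50 sites
W_mixed = math.comb(n, k)           # milk allowed on any of the 100 sites

# Entropy S = log W, with Boltzmann's constant set to 1:
S_separated = math.log(W_separated)
S_mixed = math.log(W_mixed)
print(f"S_mixed - S_separated = {S_mixed - S_separated:.2f}")

# Chance of catching the system spontaneously un-mixed:
print(f"P(separated) = {W_separated / W_mixed:.2e}")
```

Even with only 100 sites the un-mixing probability is below one in ten million; with a realistic number of molecules (of order 10^23) it is infinitesimal in exactly the sense the essay describes.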
The above argument assumes that every state, whether distinguishable or not, occurs with equal probability, like the faces of a die. This assumption is called the "principle of equal a priori probabilities" and cannot be proven. This is why the professor said that the increase of entropy is an empirical rule. Because most thermodynamic phenomena can be explained very well by assuming this principle, it is considered valid. It can be said that entropy results from identification. If you had special eyes that could distinguish every position of the molecules in well-mixed milk coffee, entropy would be zero for all states and would lose its meaning.
*1: Precisely, entropy S is defined as S = k log(W), where k is Boltzmann's constant and W is the number of states identified as one. A state of larger S comprises a larger number of such states. For the die above, "NOT6" has W = 5 and so S = k log 5, while "6" has W = 1 and so S = 0.
*2: Heat conduction can also be explained by entropy increase, in the same way as milk coffee. In this case, you should use the phase space of the positions and velocities of the molecules. In a cold state, the molecules concentrate near the center of phase space (small velocities); such a state comprises a smaller number of states than a hot state, in which the molecules spread over a wide area of phase space. When a hot object touches a cold one, the inhomogeneous distribution gets mixed, just like separated milk and coffee. That is, heat flows from the hot object to the cold one so as to increase the total entropy.
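This footnote's claim can also be illustrated by state counting. The sketch below is a standard toy model (two "Einstein solids" sharing quanta of energy), chosen by me as an assumption-laden stand-in for the hot and cold objects; it is not from the essay. The multiplicity of placing q quanta among N oscillators is W(N, q) = C(q + N - 1, q).

```python
import math

# Two toy solids A and B, N oscillators each, sharing q_total energy quanta.
def multiplicity(N, q):
    """Number of ways to distribute q quanta among N oscillators."""
    return math.comb(q + N - 1, q)

N, q_total = 50, 100

def total_entropy(q_A):
    """Total entropy (k = 1) when solid A holds q_A of the quanta."""
    return math.log(multiplicity(N, q_A)) + math.log(multiplicity(N, q_total - q_A))

# The entropy is maximized at the even split between the two solids:
best = max(range(q_total + 1), key=total_entropy)
print(f"entropy is maximized at q_A = {best}")

# A hot A (80 quanta) touching a cold B (20 quanta) drifts toward the
# even split, because overwhelmingly more states lie there:
print(f"S(80/20) = {total_entropy(80):.1f} < S(50/50) = {total_entropy(50):.1f}")
```

Heat "flows" from hot to cold in this model for the same reason the die leaves "6": the shared-energy macrostates vastly outnumber the lopsided ones.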