Hello Rakhi
This is just a matter of design. OSPF is a link-state routing protocol, which means that each router participating in OSPF must hold a complete topology of its area, including all of the OSPF routers in that area. If an OSPF area grows too large, it will consume a lot of resources (CPU, memory, etc.) on the routers, and convergence will slow down.
In order to make OSPF scalable to much larger networks, the concept of areas was introduced. As OSPF was being developed, much research went into its multi-area design. Based on the tests that engineers performed, they found that they could ensure a satisfactory convergence time only if all non-backbone areas were directly connected to area 0. Routing between areas also behaves much like a distance vector protocol, so forcing all inter-area routing information through area 0 keeps that exchange loop-free.
It is for this reason that one of the prerequisites for creating a non-backbone OSPF area is that it must be directly connected to the backbone area 0.
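To make the rule concrete, here is a minimal Python sketch that checks a topology for areas that break it. The router names, the `topology` dictionary and the `areas_missing_backbone` function are all made up for illustration; this is not how a router performs the check, just a way to reason about a design on paper.

```python
# Hypothetical model: router name -> set of OSPF areas it has interfaces in.
BACKBONE = 0

def areas_missing_backbone(router_areas):
    """Return the sorted non-backbone areas that no router connects to area 0."""
    all_areas = set()
    attached = set()
    for areas in router_areas.values():
        all_areas |= areas
        if BACKBONE in areas:
            # This router touches area 0, so every other area it sits in
            # is directly connected to the backbone through it.
            attached |= areas
    return sorted(all_areas - attached - {BACKBONE})

# Illustrative topology: R2 joins area 1 to the backbone,
# but R3 only joins area 2 to area 1.
topology = {
    "R1": {0},
    "R2": {0, 1},
    "R3": {1, 2},
}
print(areas_missing_backbone(topology))  # prints [2]
```

In this example, area 2 is the one that violates the design rule, because its only border router (R3) has no interface in area 0.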
As you may know, it is possible to “trick” an area into believing it is directly connected to the backbone area even if it isn’t. This is done using a virtual link, and you can find out more information about that here:
The virtual link was introduced only as a temporary measure and should not be used as a permanent solution. It is often useful when companies merge and their networks must be interconnected. Such mergers often require temporary network topologies that may leave non-backbone areas without a direct connection to the backbone. Until more permanent network changes can be made, a virtual link can be a “quick fix” temporary solution.
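If you want to see where such a virtual link would go, the rough Python sketch below may help. A virtual link runs between two ABRs across a shared non-backbone "transit" area, with one of those ABRs sitting in area 0. The function name `virtual_link_candidates` and the router names are hypothetical, and the sketch ignores real-world conditions such as the transit area not being a stub area and the two routers being able to reach each other inside it.

```python
# Hypothetical model: router name -> set of OSPF areas it has interfaces in.
BACKBONE = 0

def virtual_link_candidates(router_areas, orphan_area):
    """List (transit_area, backbone_abr, remote_router) triples for an orphan area."""
    candidates = []
    for remote, remote_areas in router_areas.items():
        if orphan_area not in remote_areas:
            continue  # this router cannot speak for the orphan area
        # Any other non-backbone area on the remote router could act as transit.
        for transit in remote_areas - {orphan_area, BACKBONE}:
            for abr, abr_areas in router_areas.items():
                # The far end must sit in both area 0 and the transit area.
                if abr != remote and BACKBONE in abr_areas and transit in abr_areas:
                    candidates.append((transit, abr, remote))
    return sorted(candidates)

# Illustrative post-merger topology: area 2 hangs off area 1 via R3.
topology = {
    "R1": {0},
    "R2": {0, 1},
    "R3": {1, 2},
}
print(virtual_link_candidates(topology, 2))  # prints [(1, 'R2', 'R3')]
```

Here the result says that a virtual link between R2 and R3 across transit area 1 would logically attach area 2 to the backbone until the topology can be redesigned.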
So to summarize, OSPF requires non-backbone areas to be directly connected to area 0 in order to ensure the best possible performance of OSPF, and this is based on the fundamental design of the protocol itself.
I hope this has been helpful!
Laz