Yay, theory time!
anomalousdecay is right. The physical limits in question are mostly related to transistor size and density. You can only pack so many transistors onto a chip without causing massive overheating (the kind that can't be mitigated by fans / water / other cooling devices).
There is also another important factor when it comes to multicore processing. The reason why we have multiple cores at all is that a core can only do one thing at a time. But if you're running a modern OS on a single-core computer, you can still be watching a video in your browser, messaging friends and editing a Word document at the same time. To give the impression of multitasking, what that single core is actually doing is switching between all the tasks really quickly. So it spends a few milliseconds rendering the next video frame, a few milliseconds getting input from the keyboard into the Word document, etc. This all happens so fast that for the user it seems like everything is happening simultaneously. This process is known as
time multiplexing (or time slicing) and it's controlled by the operating system's scheduler. Each process has at least one thread, and the operating system alternates rapidly between all the different threads. When your computer freezes, one likely cause is a thread that misbehaved and hogged the core or a shared resource, so the other threads never got their turn.
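To make the switching concrete, here is a toy round-robin sketch in Python. It is only an illustration, not how a real scheduler is written: the task names and step counts are made up, and the generators hand control back voluntarily, whereas a real OS scheduler interrupts threads on a timer.

```python
from collections import deque


def task(name, steps):
    """A hypothetical task that needs `steps` time slices to finish."""
    for i in range(1, steps + 1):
        # Each yield marks the end of one time slice for this task.
        yield f"{name} step {i}/{steps}"


def round_robin(tasks):
    """Give each unfinished task one slice in turn until all are done."""
    queue = deque(tasks)
    timeline = []
    while queue:
        current = queue.popleft()
        try:
            timeline.append(next(current))  # run one time slice
            queue.append(current)           # not finished: back of the queue
        except StopIteration:
            pass                            # finished: drop it from the queue
    return timeline


timeline = round_robin([task("video", 3), task("keyboard", 2), task("chat", 1)])
for entry in timeline:
    print(entry)
```

The printed timeline interleaves the three tasks, which is all that a single core is really doing when it appears to multitask.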
When you add another core, you can now do things simultaneously for real! You just multiplex across
both cores. So your computer's speed should double, right?
But what happens if, say, MS Word is in the process of reading input from your keyboard at the moment when the scheduler switches to another thread? MS Word's thread will get deactivated and your keystroke might never appear in the document. Reading input from the keyboard is a simple example of a
critical section of code: code that accesses a resource shared between multiple threads (e.g. a keyboard or a file) and must finish that access without another thread touching the same resource in the meantime. Importantly, critical sections that guard the same resource can't be parallelised at all. Even if you have two cores, if a thread on one core is accessing input from your keyboard, the thread on the other core has to wait until it is allowed to access your keyboard.
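Here is a minimal Python sketch of a critical section, assuming a shared counter stands in for the keyboard or file. The names `counter`, `add_many` and `run` are made up for the example. The lock makes sure only one thread is inside the critical section at a time, and every other thread that wants in has to wait, no matter how many cores you have.

```python
import threading

counter = 0              # the shared resource (stand-in for a keyboard or file)
lock = threading.Lock()  # guards every access to `counter`


def add_many(n):
    """Increment the shared counter n times."""
    global counter
    for _ in range(n):
        with lock:       # critical section: one thread at a time
            counter += 1


def run(num_threads=4, n=10_000):
    """Start num_threads workers and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


total = run()
print(total)  # 4 threads x 10,000 increments each
```

Because the whole loop body sits inside the lock, the four threads here effectively take turns, which is exactly the serial bottleneck described above.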
Let's say all the threads running on your computer consist of 30% critical sections (that must run in serial) and 70% parallelisable sections. You can keep adding cores to your computer and the 70% of parallelisable code can be shared among those cores, but every time a critical section runs, everything else that depends on it still has to stop and wait for it to finish. So as the number of cores approaches infinity, the speed of your computer is still bounded by the 30% of critical code: the overall speedup can never exceed 1/0.3, or roughly 3.3x. This is an application of what's known as Amdahl's law.
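Amdahl's law is usually written as speedup = 1 / (s + (1 - s) / n), where s is the serial fraction and n is the number of cores. Here is a quick Python sketch using the 30% figure from above; the function name `amdahl_speedup` is just something I picked, and the model assumes the parallel part divides perfectly across cores, which real programs rarely manage.

```python
def amdahl_speedup(serial_fraction, cores):
    """Theoretical speedup under Amdahl's law.

    serial_fraction: share of the work that must run serially (0..1)
    cores: number of cores sharing the parallelisable remainder
    """
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Upper bound as the number of cores grows without limit.
limit = 1.0 / 0.3

for cores in (1, 2, 4, 8, 64, 1024):
    print(f"{cores:>5} cores -> {amdahl_speedup(0.3, cores):.2f}x")
print(f"limit       -> {limit:.2f}x")
```

Two cores give you about a 1.5x speedup rather than 2x, and even a thousand cores stay just under the 3.33x ceiling.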
So you can't just keep adding cores to keep speeding up your computer. A better use of your time is to work out how to reduce the amount of critical code in your programs. This is becoming a hugely important part of computing nowadays.