I think the main reason is the power draw of the internal structure. If you compare CPU and GPU chip designs, the main difference is that a CPU is built from complex parts (ALU, FPU, data/instruction control, SSE, cache, etc.) whose structures are very different from each other, so each part draws a different amount of power. Carefully trimming the design of each part can reduce total power and raise the clock. A GPU is a massively parallel design, and most of its structure is the same repeated block. So if one part draws a certain amount of power, every other part draws the same, and you only have limited room to trim the GPU's power structure. Don't forget the wide memory bus, which takes a lot of power too. So the real challenge of GPU design is making the parallel parts (mostly the shader engines) draw as little power as possible, so that as many of them as possible fit on the chip, because that is directly related to performance.
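To put rough numbers on that trade-off, here is a small Python sketch. It is only an illustration: it assumes the usual CMOS dynamic power approximation (P = activity * C * V^2 * f), the function names are made up for this post, and the capacitance, voltage, clock and power budget are invented values, not measurements of any real GPU.

```python
def dynamic_power(cap_farads, volts, freq_hz, activity=1.0):
    """Approximate dynamic power of one unit: activity * C * V^2 * f."""
    return activity * cap_farads * volts ** 2 * freq_hz


def units_within_budget(budget_watts, per_unit_watts):
    """How many identical parallel units fit in a fixed power budget."""
    return int(budget_watts // per_unit_watts)


# Hypothetical numbers, chosen only to show the effect.
BUDGET_WATTS = 150.0   # assumed power budget for the shader array
VOLTS = 1.0            # assumed core voltage
CLOCK_HZ = 0.7e9       # assumed shader clock

# Baseline unit versus a unit whose switched capacitance was trimmed by 20%.
baseline_power = dynamic_power(1.0e-9, VOLTS, CLOCK_HZ)
trimmed_power = dynamic_power(0.8e-9, VOLTS, CLOCK_HZ)

baseline_units = units_within_budget(BUDGET_WATTS, baseline_power)
trimmed_units = units_within_budget(BUDGET_WATTS, trimmed_power)

print("baseline:", round(baseline_power, 3), "W per unit,", baseline_units, "units")
print("trimmed: ", round(trimmed_power, 3), "W per unit,", trimmed_units, "units")
```

Because every unit is identical, a saving on one unit is multiplied across the whole array, and at the same clock the extra units translate more or less directly into extra parallel throughput.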
By the way, the shader engine on NVIDIA has broken the 1 GHz limit; it's now nearly 2 GHz. That clever design is what made ATI, with its massive shader engine, lose out in real game performance.