Thanks for putting this analysis together, Benjamin. I've gone through the details you provided as well as the GitHub repository you linked.
As you noted in the description, some of the differences are down to extra features like time zone, time system, and calendar handling (Calendrical Calculations is actually the basis of a number of the built-in algorithms for some of the supported calendars, which you may have already guessed). Things like DayCountConvention and HolidayCalendar also add complexity to arithmetic operations.
That said, the most common cases are still substantially slower than they need to be, which is a particular problem for arrays of dates. Optimization is an ongoing task, but one we hope to have greatly improved in the coming release. We also hope to support future features like dedicated date arrays for storing and operating on large collections of dates at once.
If you have any additional questions or feedback, please feel free to post them here or contact me directly, and I'll be happy to talk in more detail.
-Nick Lariviere
Kernel Developer at Wolfram Research Inc