Sunday 6 August 2006

Bye bye Visual Basic

Well, its been less than two days since the MacBU announced that Visual Basic is being removed from the next version of Mac Office. The news has created quite a firestorm on many Mac forums (I’ve been scanning MacNN, Ars Technica, and a few others) and I received some very strongly expressed opinions about it in comments on yesterday’s post. I’d like to take some time to express my own views and experiences on the removal of Mac VB.

I should clear up one misconception about how the VB removal affects existing macros that has been making the blog and comment rounds. The removal of VB means that existing macros in Office documents will be round-tripped across file open and save, but you will not be able to edit them and you will not be able to run them on the Mac. Even previously compiled macros will not execute, because they have been compiled to PowerPC code that conforms to an older binary interface.

I want to say right up front that the MacBU is very aware of the pain this decision will cause for users, consultants, and enterprise organizations. I’ve personally seen the phrases “apoplectic with rage” and “absolutely livid” in two emails that crossed my inbox. Some people made comments on my post yesterday that were expressly clear about how this decision would drive them to one of the free Open Office variants instead of buying Mac Office 12, and other posts in other forums made similar statements. I’m sure some people will indeed decide that lack of VB is an absolute deal-breaker and they will plan to use other software. I’m truly sorry if that is the case for you.

The MacBU did not make this decision lightly. I personally spent several weeks digging into the VB source code to identify and plan what work would have to be done to move it to Xcode and make it universal, and I had several long discussions with our product planning folks to help our group leadership weigh the costs of doing the VB port vs. the costs of not doing it. I’ll try to lead you through some of the analysis here.

From my perspective, Mac Office has two primary driving requirements:

  • it must be as Mac-like as possible, use Mac features, and take advantage of the Mac operating system, and
  • it must be as compatible with Win Office as possible, and share as many features and commonalities as it can.

(We’ve got other requirements and product visions, but as I see it, they really act to refine these two basic needs.) As you may imagine, these two goals are many times not perfectly aligned. In the worst cases, they may actually be diametrically opposed, and we have to wrestle with making the best decision we can, knowing full well that whichever way we go it will help some users and hurt others. This VB decision is one where we’re truly caught between the Mac rock and the Win Office hard place.

VB on the Mac exists for cross-platform compatibility. There is no other software on the Mac that also uses VB, so it doesn’t help Mac Office integrate with other workflows based purely on Apple solutions. Thus, any work we do on VB only serves to satisfy one of the two major requirements. Doing that work then means we have less developer time to continue to improve Mac Office’s use of Apple-specific technologies (or tools, such as Xcode.)

Let me describe for you some of the technical challenges that would be involved were we to try to port VB to Xcode and to the Intel platform. For those of you reading who are not developers, bear with me for a little bit. Hopefully you’ll at least get a sense of the scope of work even if you don’t quite follow the nitty-gritty details.

VB on the Mac is really three parts: VBE (the editor), VBA (the execution engine) and Forms (the buildable windows and controls you edit in VBE and see when running a macro.)

VBE is pretty standard C++ code. However, the code is generally very old — it was originally designed and written several years before I came to Microsoft in 1996. VBE contains the top-level parser that converts the text of a macro into a series of mostly machine-independent opcodes (kind of like Java bytecodes, but not exactly the same). Thus you can’t just hook an external text editor up to VBA, because of the upper-level dependency. The VBE code actually isn’t too hard to port to Intel, but it is tricky to port to Xcode/GCC because of the age of the code. As I mentioned in an earlier post, GCC is very picky about code meeting the current standards and the VBE code most certainly does not. That’s not to say the code is ‘bad,’ it was just designed and written long before current modern C++ standards.

VBA, on the other hand, is incredibly difficult to port to Intel. The execution engine basically runs through the previously mentioned opcodes and, well, executes them. The hard part is that ‘executing’ them doesn’t mean interpreting them, it means converting one or more at a time into a block of assembly generated at runtime that looks and behaves like a regular function that can be called directly by other normally compiled code. This is in essense ’self-creating’ code, and VBA is constantly flushing the CPU’s code cache in order to mark these chunks of data as executable. VBA’s generated code must adhere to the Application Binary Interface of the host platform (historically PowerPC and the Code Fragment Manager). This means register allocation, stack alignment, parameter passing locations, etc. VBA is basically a compiler that emits code at runtime. It does so by running a large state machine that tracks PPC register usage, stack location, mapping between PPC registers and VB variables, etc and then concatenates large blocks of pre-generated assembly together. VBA subsequently tweaks the assembly bit-field by bit-field to do things like assign registers to each opcode, set branch addresses, and create transition vectors for all function calls. The templates are very PPC- and CFM-specific and the state machine is designed for architectures that allocate static stack frames and pass parameters by register, unlike Intel which has dynamic stack frames (you can push and pop data to/from the stack any time you want) and parameters are passed on the stack. So, for us to port this to Intel we’d have to rewrite the entire state machine and create brand-new templates of IA-32 code. That’s basically writing a rudimentary compiler almost from scratch (we’d at least have the initial parsing and machine-independent opcodes already done.) Again, this is all a design that long predates me or most of my peers in Mac Office, and is code that we inherited when we created the MacBU (i.e, none of us wrote it in the first place.) There’s nothing inherently bad about the code, it was just designed for the constraints of the day and that design simply doesn’t lend itself to being architecture-independent.

Some folks might ask why not just port the Win Office VBA over to the Mac? Well, VBA circa Win Office 97 (which is the closest Windows VBA to what we have on the Mac) doesn’t implement their execution engine this way at all. Instead, they have tens of thousands of lines of IA-32 assembly that directly implements all of the opcodes. That assembly does so according to the Windows Intel ABI, which is different from the Mac ABI in several important ways (the specifics of which are described here.) Also, the assembly is in MASM format which is close to but not the same as NASM as supported by GCC. So, we’d have to edit the source to be compilable by GCC, and scrub it line-by-line to find and adjust the parts that aren’t compliant with the Apple Intel ABI. We’d also end up with two completely different implementations of VBA (PPC state machine and Intel straight assembly) that we’d have to maintain and keep in sync. That would be horribly bug-prone.

Lastly, we have Forms. Forms is also C++, but is backed by several thousand lines of gnarly custom assembly. This assembly ‘allows’ the C++ code to swap object virtual function tables and individual member function pointers between objects on the fly, to essentially do very fast object morphing. To do so, the assembly has to have specific knowledge of aspects of the C++ compiler (vtable layout, implementation of ptrs-to-member-functions, etc) and has to work in lockstep with the compiler. I spent almost two weeks massaging this code to try to make it compatible with just the PPC Mach ABI, which is only slightly different from the PPC CFM ABI. Even after all that work, I still didn’t get it completely right and internal builds had some really bad stability problems. We also don’t even have the Win Office 97 Forms source code, so I was not able to compare our code to how it was implemented for Windows.

I just noted that the assembly has to work hand-in-hand with the normal C/C++ compiler. That wasn̢۪t too much of a problem when we were using CodeWarrior, as the C++ compiler only changed in small ways every few years or so. With Xcode and GCC, my understanding is that Apple has to merge in all the changes that external developers commit to GCC, and we run the risk of GCC changing much more frequently. That might not be a problem in reality, but the risk is non-zero and we have to take that into account.

One final problem is that all of this custom assembly is currently PPC 32-bit, and even the corresponding Windows assembly is Intel 32-bit. If we ever want to make a 64-bit native version of Office, any work we might do to solve all of the above problems would have to be done all over again.

So, in short: VB has lots of code and assembly that specifically assumes it is running on a PPC with the Code Fragment Manager, and to re-do it for Intel would involve writing a rudimentary compiler and relying on private compiler implementations that are subject to change at any time.

Whew, that’s a lot of technical stuff. I hope it provides some idea of the scope of work we were facing. We estimated that it would take roughly two years to of development time to move it all over to Xcode and to Intel. That would mean two more years before the next version of Mac Office made its way to consumers. In the meantime, Leopard will ship and Mac Office 2004 would still be running in Rosetta. Win Office 2007 and the new XML file formats will be ever more common. All Mac Office users would still be stuck with the old formats, unable to share in or use the great expansion of capabilities these new file formats bring. During that time, we’d also not be adding any other items our users have asked for.

Beyond that, if we were to port VB over to Intel in those two years, what you’d end up with is VB for Mac just as it is today. It still wouldn’t be feature-comparable to VB in Win Office, and the object model in Mac Office would still not be the same as the one in Win Office. That means that your macros would still be restricted to the same set of compatible items as you have today. Over the last 10 years, the Win Office programming model has become very different from that of Mac Office. We’ve tried to keep the object models in sync for the features that we have ported from Win Office, but we haven’t ported everything.

So, given that the developer cost was huge, that the consumer cost due to the delay while we did the work was quite large, and that the end result would be no better than what we have today, we made the very difficult decision to invest our time and resources in the other pillar of Mac Office, namely taking advantage of Apple tools and technologies to be more ‘Mac-like’. We’ve continued to improve the AppleScriptability of our apps (many many bug fixes post-Office-2004) and as announced are looking into adding some Automator actions to the suite. We’ve completed the rest of our transition to Xcode and to Intel and are forging ahead with the rest of the product.

I think a common question might be ‘if the cost is so huge, why doesn’t Microsoft just devote more resources to the problem? They’ve got a ton of cash, right?’ Well, the real question is ‘what resources do you throw at the problem?’ We’ve been working very hard to hire a bunch of developers, but it has turned out to be quite difficult to fill our existing open headcount positions. As an example, I’ve had an open position on my own team for 9 of the last 12 months (it took 8 months to fill the slot when one developer moved from my team to another one in MacBU, and only last week did we hire someone to fill the slot vacated recently when another developer moved to a different team at Microsoft.) The question of how Microsoft allocates developer headcount and funding to MacBU is a separate topic of its own which hopefully I or some other MacBU blogger will tackle later. In any case, there’s no point in adding new headcount to the MacBU when we haven’t yet filled the positions we already have open.

I know that explaining all this doesn’t make the fact of VB’s death any easier for those users who currently depend on it. As I said at the beginning, we in the MacBU really are aware of the difficulties you face. Our product planners, program managers, developers, and testers are working to alleviate some of that pain. Many people have only a few simple macros they use, and I do want to point out that those macros will translate very easily into AppleScript. Even large macros can be rewritten in AppleScript, although that takes some time and definitely some knowledge scripting on the Mac. The AppleScript object model and the old VB object model for our apps are roughly equivalent, so apart from the syntactical differences, if you could do it in VB you can do it in AppleScript. While I can’t comment on any more specific feature work for Office 12, I’m sure we will be working closely with enterprise customers to help them address their concerns. We’ll be saying more about our scripting plans as we get closer to the product release for Office 12.

For those of you contemplating a switch to Open Office, I don’t know if Open Office has any support for VB macros or other OLE Automation technologies so I don’t know if you’ll be any better off from a cross-platform perspective. You probably can’t be worse-off except that Open Office certainly doesn’t support any of the Mac OS scripting technologies that Mac Office does support and in which we will continue to invest, nor will it (at least for a while yet) support the new XML-based file formats. If you do switch, we’ll miss you.

Many people have viewed this announcement by MacBU as a sign that we are out to screw the Mac community, or that we’re just looking for an exit strategy. We’re not. Most empatically, we’re not. This decision was agonizing. My manager even said he felt ’sick about the impact on those who really rely on xplat [cross-platform] VB support, particularly in Excel where we see it the most.’ In my post yesterday, I said that I wasn’t so sad to see VB go. I said that from the perspective of a developer who’s worked to maintain the code for many years. However, there’s nothing good about removing a feature that many people rely on, except that it frees up resources for us to invest more heavily in other important areas. Due to the age of the code, VB has been a very large drain on our resources for a long time with relatively little return. A couple of months ago I wrote that I hoped my blog would help people trust the MacBU a little more. I can see that many of you are very mad about this decision; I do hope that my post today helps you see some of the issues behind the press release. We had to make a hard decision one way or the other, and this is how it turned out.