Part 1 (C#)- Part 2 (Testing C#)- Part 3 (PowerShell)
The source code for Part 4 can be downloaded here (Visual Studio 2008). Download and use at your own risk.
NUnit v2.5.0.9122 can be found here.
The reference in the test harness is to PowerShell 2.0 CTP2. Modify the reference – System.Management.Automation – so that it points to your PowerShell installation, and remove any lines that won’t compile.
I am not sure if this post on C++ really belongs in this series but given I can reuse a common theme – generating code within a user interface and regression testing it using NUnit – I am grouping it all together. It’s obviously different: after generating .Net code or PowerShell scripts, you can compile that code and run it within the context of your application. You would (probably!) not do that with C++ code. So why the post?
These posts are about two things: generating code based on a user or developer driving a data structure (IBindingList in this case) and automating the testing of what gets generated in NUnit. The principle can be applied to any generated C++ code… but why would you use NUnit to test generated C++ code?! It sounds a bit bizarre, until you realize that NUnit can be used to easily launch Visual Studio and compile C++ code on your behalf… I tend to use this approach for establishing confidence in the generated code, and then use CPPUnit for the detailed testing. Alas. Let’s get on with it….
… well, almost
Why would you generate C++ code on the fly? For my API Modeling framework which has C# generated classes, it’s obviously worth generating macro code in C# and PowerShell (and VB.Net and Managed C++) to show how to interact with that API. It allows third parties to work in the language they know. But that same API might also be hosted in an external web service; driving it as a client using C++ might be a requirement. Or from Java even. Or some other language. Rather than meticulously documenting an API in numerous languages, which I know from experience is surprisingly difficult and expensive to do well, I would prefer to have people ‘discover it’. By driving the application in the ‘master’ language (which contains the macro generators and so forth) in a UI, they can learn how to drive it from any other. And, if you can find a way of regression testing the generated code like this series shows you, you always know that what they discover is up to date and usable.
But back to the code generation. We will stick to the sample I’ve used throughout but I’ll add another tab for C++:
To recall from Parts 1 thru 3: the user drives the GridView which is bound to an IBindingList; the IBindingList fires the ListChanged event; a handler on our form dispatches that event to the macro recorder which coordinates the code generation.
The C++ generator requires a different approach. It breaks everything, but I’ll come back to that. Repeatedly. First up though, there is no BindingList in C++. Instead, I will map the BindingList calls into ATL::CAtlArray calls.
And that’s where things start to break down.
Look at the generated code that creates the collection: it’s called pBindingList and not pATLArray or something else like you would expect. Why? The code generator is written in .Net. I am using the collection.GetType().Name from .Net to give it its name – its .Net name, more specifically. Is this a problem? Yes.
Rather than go into detail here, I have a ‘Deep Thinking’ section at the bottom that goes through these issues… but to all intents and purposes, you can ignore it.
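To make the naming point concrete, here is a minimal sketch of how a name such as pBindingList1 comes about: one variable name per object, built from a type name plus a running counter. It is written in C++ only to keep the examples in one language (the real generator is .Net), and NameRegistry, NameFor and Forget are hypothetical names, not the NamingManager from the download.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a naming manager: hands out one variable
// name per object, built from a type name plus a running counter.
class NameRegistry
{
public:
    // Returns the existing name for 'object', or creates "p<Type><N>".
    std::string NameFor(const void* object, const std::string& typeName)
    {
        std::map<const void*, std::string>::const_iterator found = m_names.find(object);
        if (found != m_names.end())
            return found->second;

        const int index = ++m_counters[typeName];
        const std::string name = "p" + typeName + std::to_string(index);
        m_names[object] = name;
        return name;
    }

    // Forgets an object so that a later reference to it is treated as new.
    void Forget(const void* object) { m_names.erase(object); }

private:
    std::map<const void*, std::string> m_names;  // object -> variable name
    std::map<std::string, int> m_counters;       // type name -> last index used
};
```

Feed it the .Net type name "BindingList" and you get pBindingList1 back, which is why the C++ variable ends up with a .Net-flavoured name unless something supplies the C++ type name instead.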
So… to testing
By now, the single test I have in the Test Harness should look familiar. You run the following code in the Test Harness…
System.ComponentModel.BindingList<Person> people = new System.ComponentModel.BindingList<Person>();
// We've set up the handler that will forward changes to the macro recorder...
// by driving the list directly, we can always view Recorder.Instance.Generators[0].Text
// to see what has been generated so far.
people.ListChanged += people_ListChanged;
Person woo = new Person();
woo.Name = "WOO";
woo.Age = 33;
people.Add(woo);
Person hoo = new Person();
hoo.Name = "HOO";
hoo.Age = 50;
people.Add(hoo);
people.Remove(woo);
… and by the time you get to the end, the CPPGenerator.Text field looks like this:
ATL::CAtlArray<CPerson*>* pBindingList1 = new ATL::CAtlArray<CPerson*>();
CPerson* pPerson1 = new CPerson();
pPerson1->SetName(L"WOO");
pPerson1->SetAge(33);
pBindingList1->Add(pPerson1);
CPerson* pPerson2 = new CPerson();
pPerson2->SetName(L"HOO");
pPerson2->SetAge(50);
pBindingList1->Add(pPerson2);
delete pBindingList1->GetAt(0);
pBindingList1->RemoveAt(0);
You substitute that text into the C++ template:
#include "stdafx.h"
#include "atlcoll.h"
#include "Person.h"
#include "XmlDumper.h"
class CGeneratedTestRunner
{
public:
// Execute the generated code and dump the output to the console.
static void Run()
{
//%CONTENTS%//
CXmlDumper::Dump(pBindingList1);
}
};
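The substitution itself is nothing more than a string replace on the //%CONTENTS%// marker. A minimal sketch, assuming the whole template has already been read into a string; the test harness does this in .Net, and FillTemplate is a hypothetical name used only for illustration:

```cpp
#include <string>

// Replaces the first occurrence of 'marker' in 'templateText' with
// 'generated'. Returns the template unchanged if the marker is absent.
std::string FillTemplate(const std::string& templateText,
                         const std::string& generated,
                         const std::string& marker = "//%CONTENTS%//")
{
    std::string result = templateText;
    const std::string::size_type pos = result.find(marker);
    if (pos != std::string::npos)
        result.replace(pos, marker.size(), generated);
    return result;
}
```

Keeping the marker inside a C++ comment means the untouched template still compiles, which is handy when you want to check the console solution on its own.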
And then you test the resulting code:
#include "stdafx.h"
#include "atlcoll.h"
#include "Person.h"
#include "XmlDumper.h"
class CGeneratedTestRunner
{
public:
// Execute the generated code and dump the output to the console.
static void Run()
{
ATL::CAtlArray<CPerson*>* pBindingList1 = new ATL::CAtlArray<CPerson*>();
CPerson* pPerson1 = new CPerson();
pPerson1->SetName(L"WOO");
pPerson1->SetAge(33);
pBindingList1->Add(pPerson1);
CPerson* pPerson2 = new CPerson();
pPerson2->SetName(L"HOO");
pPerson2->SetAge(50);
pBindingList1->Add(pPerson2);
delete pBindingList1->GetAt(0);
pBindingList1->RemoveAt(0);
CXmlDumper::Dump(pBindingList1);
}
};
Ahem. Well. You will eventually. Testing C++ requires a bit more infrastructure to work. Well, perhaps not, but for the purposes of this sample that’s what I’ll do. You need to create a Visual Studio Solution (a Win32 Console App in my case) that does as little as possible. You then structure that solution so that one of the files can be overwritten by the test harness:
That solution is also part of the source code you can download above.
What happens next is fairly obvious: as part of the test harness, you overwrite that file with the contents of the code you want to compile. Then you need to compile the code automatically… how?
Visual Studio is very automation friendly (I wouldn’t like to work on the Visual Studio Extensibility Team though: they’ve done a cracking job of opening up Visual Studio since 2005, but by looking at the forums it seems no one will ever be happy with what they’ve done. They always want more!).
So to test the C++ generated code, the Compile method does this:
// TestHarness/CPP.cs
//
// ... bits missing ... the source file is written out before this happens ...
//
Type vsType = Type.GetTypeFromProgID("VisualStudio.DTE.9.0");
EnvDTE.DTE visualStudio = Activator.CreateInstance(vsType) as EnvDTE.DTE;
visualStudio.MainWindow.Visible = true;
visualStudio.Solution.Open(BaseLocation + @"\CPPConsoleApp\CPPConsoleApp.sln");
visualStudio.Solution.SolutionBuild.SolutionConfigurations.Item("Debug").Activate();
visualStudio.Solution.SolutionBuild.Build(true);
Assert.AreEqual(0, visualStudio.Solution.SolutionBuild.LastBuildInfo);
It’s straightforward: open Visual Studio, make the window visible, open a solution, build it, then check that no projects failed to build (LastBuildInfo reports the number of projects that failed). As an aside, I tend to always make the launched Visual Studio instance visible. Why? If a test fails, I tend to ensure that Visual Studio remains open (if I am working interactively). This gives me a chance to investigate the problem in the context of what is going on right now.
By driving Visual Studio in this way, I’ve had to add the EnvDTE80.DLL as a reference to my project.
But how to test it? How to make sure we have built up an AtlArray the same as our BindingList? First of all, let’s articulate what I’m trying to test in this case: although I’m not doing this here, imagine I developed a collection in C# and I want to ensure that the collection behaves the same in C++. I have generated classes in C# and C++ and I need to ensure they are serialized the same in both languages to facilitate data exchange. I need to compare the built-up C++ object with my C# one to ensure they are the same. They should be, or something is inconsistent. How to compare? Let’s see how the IBindingList implementation looks when it is serialized in C#:
<ArrayOfPerson>
  <Person>
    <Name>HOO</Name>
    <Age>50</Age>
  </Person>
</ArrayOfPerson>
Yup. That looks nice. I think we’ll use that! When the console application is run, the C++ code will dump out an Xml description of its built up object. We won’t worry about making the output exactly the same, character by character: instead, we’ll just concentrate on making the Xml such that it can be deserialized back into an IBindingList instance. We will then construct an object using that Xml in C# before comparing it as part of our test. Crude, but effective.
If you look at TestHarness\Templates\CPP.TXT file you will see the Dump method at the end:
CXmlDumper::Dump(pBindingList1);
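To give an idea of what such a dumper has to do, here is a minimal sketch in standard C++. It is not the CXmlDumper from the download: it uses std::vector and a plain Person struct instead of CAtlArray and CPerson, narrow strings instead of wide ones, and the element names (ArrayOfPerson, Person, Name, Age) are assumptions about what the .Net deserializer expects.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for the generated CPerson class.
struct Person
{
    std::string name;
    int age;
};

// Escapes the characters that are not allowed in XML element text.
std::string EscapeXml(const std::string& text)
{
    std::string escaped;
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        switch (text[i])
        {
        case '&': escaped += "&amp;"; break;
        case '<': escaped += "&lt;"; break;
        case '>': escaped += "&gt;"; break;
        default:  escaped += text[i]; break;
        }
    }
    return escaped;
}

// Builds an XML description of the list, one <Person> element per item.
std::string ToXml(const std::vector<const Person*>& people)
{
    std::ostringstream xml;
    xml << "<ArrayOfPerson>\n";
    for (std::size_t i = 0; i < people.size(); ++i)
    {
        xml << "  <Person>\n"
            << "    <Name>" << EscapeXml(people[i]->name) << "</Name>\n"
            << "    <Age>" << people[i]->age << "</Age>\n"
            << "  </Person>\n";
    }
    xml << "</ArrayOfPerson>\n";
    return xml.str();
}
```

A console app would simply write the result of ToXml to std::cout. Escaping matters even in a test: a name containing & or < would otherwise produce output that the deserializer rejects.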
Where does the C++ Console App dump its data? StdOut is probably the best place for a test. When we run the app – in TestHarness/CPP.cs/Execute – it looks like this:
// CPP.cs
//
// ... stuff missing that sets up the cmd.exe call
//
m_command.Start();
string output = m_command.StandardOutput.ReadToEnd();
m_command.Close();
visualStudio.Solution.Close(false);
visualStudio.Quit();
// We can now try and instantiate our BindingList using the string we got back.
System.Xml.Serialization.XmlSerializer s = new System.Xml.Serialization.XmlSerializer(typeof(System.ComponentModel.BindingList<Person>));
object result = s.Deserialize(new StringReader(output));
return result;
By using StdOut we don’t need an intermediate file. You can see the Deserialize method: this reconstructs (in .Net) a BindingList based on the Xml that was output by running the C++ Console app. The value that is returned is given back to the test harness and compared with the one that was built up manually in C#.
Gotchas
There was one major problem I encountered trying to get this going. Sometimes, when driving Visual Studio through Automation from an external multithreaded application, you get these errors:
Call was rejected by callee (RPC_E_CALL_REJECTED 0x80010001)
Application is busy (RPC_E_SERVERCALL_RETRYLATER 0x8001010A)
At random. It’s very, very, very, funny.
The solution is to read this article: http://msdn.microsoft.com/en-us/library/ms228772.aspx
Or, if you can’t be bothered, look at the bottom of the TestHarness/CPP.cs file. On investigation, their solution was not working – CoRegisterMessageFilter was returning a bizarre HRESULT that not even Google knows about. So one can conclude: it does not exist. The reason was that my NUnit tests were being run in an MTA when I thought they were being run in an STA (NUnit uses a Multithreaded Apartment so it can update the tree view as the tests run… awww, how nice!).
Bottom line: I needed to run my test in an STA before I could call CoRegisterMessageFilter and the way to do that was to modify the TestHarness.dll.config file so it contained this entry:
And for some reason the syntax highlighter has changed that beyond recognition. I suggest you look at the one in the attached zip file.
The point of note is the ="STA". I have put an assertion in CPP.cs to ensure that Thread.CurrentThread.GetApartmentState() always returns STA. Without that, you will get issues driving Visual Studio through automation in your NUnit tests.
Tchau!
Deep thinking…
This section is an addendum and contains a lot of stuff you probably don’t need to worry about unless you are serious about generating C++ code on the fly. Bottom line: it’s harder than you think.
Originally, I asked: is it a problem if the variable is called pBindingList1 instead of pATLArray? Aesthetically, at least, yes. But solving that problem requires a surprisingly large amount of infrastructure around it. If I was generating code on the fly for C# and C++, I would (almost certainly) be doing so from a common model. I would then be calling collection.GetModel().Name or something similar instead of collection.GetType(). Assuming it’s a class model for now, that generated code would probably use an infrastructure I had developed in each language that tried to make fundamental data structures the same across all languages. I would perhaps create a collection class in each language that exposed exactly the same methods: Add, Remove, Swap and so forth. Or, I would ensure there was a suitable semantic and syntactic mapping between my C# collection type of choice and the one I used in C++. At that point, I would know when I generate the macro code for C++ what collection type to use and therefore what naming convention. At the moment, I don’t have that information around and I just have to guess.
Generating C++ code automatically as you go along brings up *ALL KINDS OF ISSUES!* In the grand scheme of things they aren’t that important but they are worth thinking about. It’s probably the hardest language to generate code for and yet probably the one that needs it the most. When you can do this, from a model, pat yourself on the back.
For example: if I add an item to the binding list, I generate code for it in C++ like this:
CPerson* pPerson1 = new CPerson();
pPerson1->SetName(L"Graham");
pPerson1->SetAge(33);
pBindingList1->Add(pPerson1);
If I remove the item from the binding list… do I destroy the pointer in C++? Although removed from the list in C++, that object might still be part of some ‘greater’ model in C++ so it should not be destroyed. After all, even after removing an item from the list in C#, I can still change one of its properties in the UI and get the Macro Code for that property change in any language. By definition, if I can still set the property in C# I still have a reference to it, so it will not be garbage collected while I hold that reference.
No such beauty exists in C++.
In C++, if I delete the pointer when the object was removed from the list, that object would no longer be around when the property change came through. I’ll leave this one with you to work out. The best you can do with generating C++ code on the fly like this is probably to just get people up and running with your interface. That’s usually one of the biggest hurdles when trying to integrate with third party software anyway. What I do in this sample is destroy the pointer and (in CPPGenerator) remove the object reference from the NamingManager. That way, when the property change comes through, the object reference is unknown in C++ and will be built up. Or rather, I would remove the object reference from the NamingManager IF I KNEW WHAT OBJECT HAD JUST BEEN REMOVED! I only know its index: the ListChanged notification for ListChangedType.ItemDeleted carries the index of the removed item (ListChangedEventArgs.NewIndex) but not a reference to the object that has just been purged. Problems! So the sample will generate C++ code that leaks.
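One direction the sample does not take, but which gets close to the C# behaviour, is shared ownership. The sketch below assumes C++11 and assumes the generated classes could be held through std::shared_ptr rather than raw pointers; Person, PersonList and RemoveAt are illustrative names only.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for the generated CPerson class.
struct Person
{
    std::string name;
    int age;
};

typedef std::vector<std::shared_ptr<Person> > PersonList;

// Removes the item at 'index' from the list. The Person itself is only
// destroyed when the last shared_ptr referring to it goes away.
void RemoveAt(PersonList& list, std::size_t index)
{
    list.erase(list.begin() + static_cast<PersonList::difference_type>(index));
}
```

With this, generated code that keeps its own std::shared_ptr to a person can still set a property after the person has been removed from the list, much as the C# reference stays valid. It does not solve the naming problem above, and it forces a particular ownership style onto whoever consumes the generated code.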
Or you might have something in your model that helps you. Perhaps you have the concept of a list that ‘owns’ its pointers; or a list that does not. You would then generate the code accordingly: if it’s removed from the list that owns its pointers, you destroy the pointer. If the list is just a ‘view’ or a ‘filtered set’, you leave the pointer alone. Once again, this comes back to how you model your software and whether you reuse that model throughout your life cycle. If you have this kind of information in your model (and there might be good reasons for NOT having it there – perhaps it’s implementation specific, or too low level) then you can certainly use it here providing the model is accessible at this point in your application.
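As a sketch of what ‘generate the code accordingly’ could look like, assume the model exposes a simple ownership flag for each list. ListOwnership and GenerateRemove are hypothetical; the emitted statements follow the GetAt/RemoveAt pattern of the generated code shown earlier.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical model flag: does the list own the objects it points to?
enum ListOwnership { OwnsItems, ViewOnly };

// Emits the C++ statements that remove the item at 'index' from the list
// variable 'listName'. Only an owning list deletes the object first.
std::string GenerateRemove(const std::string& listName,
                           std::size_t index,
                           ListOwnership ownership)
{
    std::ostringstream code;
    if (ownership == OwnsItems)
        code << "delete " << listName << "->GetAt(" << index << ");\n";
    code << listName << "->RemoveAt(" << index << ");\n";
    return code.str();
}
```

For an owning list this produces the delete followed by the RemoveAt; for a view it produces only the RemoveAt and leaves the object alone.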
There are lots of other issues too, some related to the above: previously, because I was generating macros for C# and PowerShell, I could just use .GetType() to find out the type I was referring to when I generate the code. I can’t do that in the C++ generators. If I was serious about generating C++ code, I would need some way of finding out – in C# – what the C++ name for an equivalent class was. For example: it’s MacroSample.Person in C#; but it might be SomeOtherNamespace::MacroSample::CPerson in C++. That kind of information comes from the model and – particularly – the job parameters used to generate the C++ classes you are now writing the macro generators for. We also need to consider things such as the ‘namespace separator’ used in different languages (. vs :: ).
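The name mapping in that example can be sketched as a small function. This assumes the outer namespace and the class prefix are available as job parameters; ToCppName and both parameter names are hypothetical, and a real mapping would come from the model rather than from string manipulation.

```cpp
#include <string>

// Maps a dotted .Net type name to a C++ qualified name. 'outerNamespace'
// (may be empty) and 'classPrefix' stand in for hypothetical job parameters.
std::string ToCppName(const std::string& dotNetName,
                      const std::string& outerNamespace,
                      const std::string& classPrefix)
{
    std::string result;
    if (!outerNamespace.empty())
        result = outerNamespace + "::";

    // Copy each namespace segment, swapping '.' for '::'.
    std::string::size_type start = 0;
    std::string::size_type dot;
    while ((dot = dotNetName.find('.', start)) != std::string::npos)
    {
        result += dotNetName.substr(start, dot - start) + "::";
        start = dot + 1;
    }

    // Whatever follows the last '.' is the class name itself.
    result += classPrefix + dotNetName.substr(start);
    return result;
}
```

Called with "MacroSample.Person", "SomeOtherNamespace" and "C", it yields SomeOtherNamespace::MacroSample::CPerson.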
BOTTOM LINE: Generating C++ code from C# without having a model around is hard and is quite different from the Managed languages. To get it right, and on-the-fly generated C++ code at least semi-usable, requires quite an investment of time and effort. I can’t think of any other way of tackling these problems without modeling your software, generating the code and having the macro recorder interpret the model at runtime. Nor can I think of any way of solving the ‘dangling pointer’ issue that generating C++ code on the fly will eventually bring up. Probably the best we can do is put comments in the code to direct the user.
But anyway. All those problems are not an indication that we should not generate C++ code; they are a hint we need to think about the problem and solution a bit harder!