As an ArgoUML contributor I'm going to blog my activities here, so that they may draw the interest of other developers or help them when doing tasks similar to what I've done. AND(!) the grand vision that makes an Argonaut what he is, TO THRIVE IN THE BIG DANGEROUS WORLD, TAKING THE Argo TO A GOOD SHORE ;-))

Wednesday, December 05, 2007

2007-12-05

I'm now working on issue #4923, and this requires some measurements of the time the automated headless tests take to run. Thanks to Linus' efforts in setting up a continuous integration server this isn't very difficult: I will simply take the latest 10 results from the revisions of the JUnit reports summary and compare them with the next 10 results after making my change in the model-mdr implementation.

So, here are the values and the total average before the change:

Before changes, Java 5 tests.
Revision  Tests  Failures  Errors  Success rate  Time (s)
4456      1115   1         0       99.91%        3065.731
4443      1115   1         0       99.91%        2921.652
4431      1115   1         0       99.91%        2899.956
4408      1117   2         0       99.82%        2792.906
4392      1116   3         0       99.73%        2789.595
4378      1116   0         0       100.00%       2866.223
4367      1116   0         0       100.00%       2842.319
4355      1117   0         0       100.00%       2861.930
4345      1117   0         0       100.00%       2887.080
4340      1101   0         0       100.00%       2787.845
Average time: 2871.524

Before changes, Java 6 tests.
Revision  Tests  Failures  Errors  Success rate  Time (s)
4462      1115   1         0       99.91%        2715.539
4448      1115   1         0       99.91%        2665.635
4437      1115   1         0       99.91%        2665.288
4400      1116   3         0       99.73%        2526.352
4383      1116   0         0       100.00%       2627.562
4372      1116   0         0       100.00%       2596.418
4360      1117   0         0       100.00%       2651.676
4350      1117   0         0       100.00%       2654.227
4332      1101   0         0       100.00%       2514.612
4318      1101   0         0       100.00%       2509.551
Average time: 2612.686

Tomorrow I'll run tests with ArgoUML running to check whether the MDRModelImplementation constructor is called during its execution and, if not, I'll commit my changes to MDR. Then it is a matter of waiting 10 days for the verdict on the performance hit.

Update on 2007-12-19: added averages and results after changes.

After changes, Java 5 tests.
Revision  Tests  Failures  Errors  Success rate  Time (s)
4611      1116   0         0       100.00%       2910.440
4594      1116   0         0       100.00%       2917.587
4581      1116   0         0       100.00%       2905.284
4570      1116   0         0       100.00%       2903.603
4559      1116   0         0       100.00%       2898.110
4546      1116   0         0       100.00%       2909.500
4533      1115   0         0       100.00%       2906.624
4521      1115   0         0       100.00%       2904.351
4508      1115   0         0       100.00%       2939.427
4496      1115   0         0       100.00%       2903.854
Average time: 2909.88

After changes, Java 6 tests.
Revision  Tests  Failures  Errors  Success rate  Time (s)
4586      1116   0         0       100.00%       2665.922
4575      1116   0         0       100.00%       2665.115
4564      1116   0         0       100.00%       2660.793
4552      1116   0         0       100.00%       2677.356
4538      1115   0         0       100.00%       2668.918
4526      1115   0         0       100.00%       2677.958
4514      1115   0         0       100.00%       2699.183
4501      1115   0         0       100.00%       2687.672
4488      1115   0         0       100.00%       2630.221
4474      1115   1         0       99.91%        2757.297
Average time: 2679.04

So, a ~2.5% performance hit in Java 6 and a ~1.3% hit under Java 5.
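For anyone who wants to check the arithmetic, here is a small Python sketch that derives those percentages from the four averages in the tables. The function name slowdown_percent is just something made up for this illustration.

```python
# Average test-suite times in seconds, copied from the tables in this post.
averages = {
    "Java 5": {"before": 2871.524, "after": 2909.88},
    "Java 6": {"before": 2612.686, "after": 2679.04},
}


def slowdown_percent(before, after):
    """Relative increase in run time, as a percentage of the 'before' time."""
    return (after - before) / before * 100.0


for jvm, times in averages.items():
    hit = slowdown_percent(times["before"], times["after"])
    print("{}: {:.1f}% slower".format(jvm, hit))
```

Keep in mind that each average covers only 10 runs on a shared build server, so the percentages are an indication rather than a precise measurement.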

Monday, November 19, 2007

2007-11-19

While I was testing a patch by Lukasz Gromanowski, I found a bug in org.argouml.ui.SettingsDialog. This one is interesting. The contract established by GUISettingsTabInterface is that implementers will be called when the user saves the settings. But that wasn't happening for SettingsTabCpp. After some debugging, and after having the bug right in front of me several times, I finally understood. SettingsTabCpp does not extend JPanel, and the SettingsDialog was only invoking the callback methods of those components contained in its tabs member (an object of type JTabbedPane) that were of type GUISettingsTabInterface.
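The mechanics of this bug are easier to see in a stripped-down model. The sketch below is Python rather than the real Java code, and every class and function name in it is a hypothetical stand-in: Component plays the role of a Swing component, SettingsTab the role of GUISettingsTabInterface. The fixed_save variant is only one possible way to honour the contract, not necessarily what went into ArgoUML.

```python
class Component:
    """Stand-in for a Swing component such as JPanel (hypothetical)."""


class SettingsTab:
    """Stand-in for the GUISettingsTabInterface contract (hypothetical)."""

    def __init__(self):
        self.saved = False

    def on_save(self):
        self.saved = True

    def panel(self):
        """Return the component that is shown in the tabbed pane."""
        raise NotImplementedError


class PanelTab(Component, SettingsTab):
    """A tab that is its own panel."""

    def panel(self):
        return self


class DelegatingTab(SettingsTab):
    """A tab that hands out a separate panel, as SettingsTabCpp does."""

    def __init__(self):
        super().__init__()
        self._panel = Component()

    def panel(self):
        return self._panel


def buggy_save(tabs):
    # Walks the components in the tabbed pane; a separate panel is not a
    # SettingsTab, so its owner is silently skipped.
    for component in [tab.panel() for tab in tabs]:
        if isinstance(component, SettingsTab):
            component.on_save()


def fixed_save(tabs):
    # Keeps the registered tabs themselves and calls every one of them.
    for tab in tabs:
        tab.on_save()
```

With one PanelTab and one DelegatingTab, buggy_save only reaches the first, while fixed_save reaches both.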

Monday, November 12, 2007

My promotion to core developer

I was promoted to core developer by Linus Tolke, I believe in agreement with the other active core developers: Bob Tarling, Michiel van der Wulp and Tom Morris. I'm very happy about this, and it will be motivating to start working more often in the core of ArgoUML.

Tom sent a very warm welcome message to the developers mailing list. Regarding my part, he mentions two features I'm keen to get involved in: the profiles, and the support for parameterized classes and UML templates in general. Well, I was already involved, but now I'm much more motivated to work directly in the core to advance these two features. They are central for ArgoUML to be a good basis for C++ model driven development, which will continue to be my main focus in the future.

Refactoring org.argouml.uml.profile

As stated in issue #4885 and in the dev mailing list thread "org.argouml.uml.profile - success in working from models and improvement proposals", I'm working on refactoring the profile sub-subsystem of ArgoUML so that it is possible for modules to define their own profiles. The main problem to get rid of is that there are singletons in it, such as the ProfileManagerImpl. With very few exceptions, singletons are an abused design pattern. More so in Java projects, since Java doesn't support global variables and, guess what, a singleton is a replacement for the humble global variable, even if it is a very clumsy and pernicious one.

Before proceeding with my rambling about singletons being bad, let me say that I'm very happy with the work contributed by Marcos Aurélio, one of the developers that joined in the Google Summer of Code 2007. The package org.argouml.uml.profile is coherent and free of monster classes and methods, with a nice balance between abstractions and implementation classes and a very pleasant distribution of responsibilities amongst the classes. Furthermore, so far I haven't found a single defect!

The refactoring will be much easier than what I will have to do to improve the GeneratorCpp, which is a singleton :-( ... and that brings us back to my rambling...

Singletons are clumsy because, instead of one line of code declaring a variable at global scope [1], one line initializing it in some appropriate place of your code and a direct reference from wherever you want to access it, you now have to define a private constructor and a static accessor method, and then call this method from wherever you need the object. They are pernicious because if you want to abstract the implementation of the global variable, you can't: every single user object now refers to the singleton class directly, even when it doesn't need to, because one of its owners or more closely related objects could have provided the object. Another bad effect is that when the application eventually evolves and you would like to have more than one of those objects, or a fresh one for another piece of work or for unit testing, the singleton and the singleton-accessing code stop you from doing it.
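To make the contrast concrete, here is a rough Python analogue. Python has no private constructors, so this only approximates the Java idiom, and ProfileRegistry, SingletonRegistry and both register functions are hypothetical names invented for this sketch, not ArgoUML classes.

```python
class ProfileRegistry:
    """Hypothetical stand-in for a profile manager."""

    def __init__(self):
        self.profiles = []

    def register(self, name):
        self.profiles.append(name)


class SingletonRegistry(ProfileRegistry):
    """Singleton flavour: one hidden instance behind a class-level accessor."""

    _instance = None

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


def register_cpp_via_singleton():
    # The dependency is invisible in the signature and cannot be swapped.
    SingletonRegistry.get_instance().register("C++")


def register_cpp(registry):
    # The caller decides which registry is used, e.g. a fresh one per test.
    registry.register("C++")
```

Calling register_cpp_via_singleton twice piles both registrations onto the same hidden object, whereas register_cpp(ProfileRegistry()) always starts from a clean slate, which is exactly what a unit test wants.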

Today I found yet another pernicious effect: loss of control of initialization order. In my checked out copy I made a spike to check whether the plan of having the C++ module provide the UML profile for C++ would work, based on the support recently contributed by Marcos Aurélio. So, in the C++ module I defined a new ModuleInterface implementation that registers a ProfileCpp object in the ProfileManagerImpl instance. Now, guess who is indirectly instantiating both ProjectManager and ProfileManagerImpl? Yeah, the humble C++ module or, rather, the ModuleLoader2 that loads it! Worse, while at it, I noticed that the order in which ModuleLoader2 initializes modules seems to be arbitrary. Because SettingsCpp is also loaded as a module, and it accesses the GUI instance, the GUI will also be initialized not explicitly by Main.main but indirectly by the SettingsCpp module.
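The effect is easy to reproduce in miniature. In this Python sketch (all names hypothetical, and much simpler than the real ArgoUML classes) each subsystem is created lazily on first access, and a log records who came to life when.

```python
init_log = []  # records the order in which subsystems come to life


class LazySubsystem:
    """Hypothetical lazily created singletons; first access creates them."""

    _instances = {}

    @classmethod
    def get(cls, name):
        if name not in cls._instances:
            init_log.append(name)
            cls._instances[name] = object()
        return cls._instances[name]


def load_cpp_module():
    # Registering a profile touches the profile manager first...
    LazySubsystem.get("ProfileManager")
    # ...and the settings tab touches the GUI.
    LazySubsystem.get("GUI")


def main():
    load_cpp_module()  # module loading happens to run first
    # By now the "initialization" done by main itself is a no-op.
    LazySubsystem.get("GUI")
    LazySubsystem.get("ProfileManager")
```

After main() runs, init_log holds the order dictated by the module, not the order written in main.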

Isn't this bad?!?

The way to solve this is via explicit initialization of subsystems, and subsystems that keep their details to themselves. The best example is the org.argouml.model subsystem. It provides means for explicit initialization, and access to it is via static functions of the Model class.
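A minimal sketch of that shape, again in Python and with made-up method names (initialize and get_implementation are assumptions for this illustration, not the actual methods of org.argouml.model.Model):

```python
class Model:
    """Hypothetical facade: explicit initialization, static-style access."""

    _implementation = None

    @classmethod
    def initialize(cls, implementation):
        # Called once, from a well-known place such as the application's main.
        if cls._implementation is not None:
            raise RuntimeError("Model is already initialized")
        cls._implementation = implementation

    @classmethod
    def get_implementation(cls):
        # Fails loudly instead of creating something behind the caller's back.
        if cls._implementation is None:
            raise RuntimeError("Model.initialize() must be called first")
        return cls._implementation
```

The point of the design is that nothing is created as a side effect of access: whoever starts the application decides when the subsystem is initialized and with which implementation.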

Alas, the Profile subsystem may not need to have several implementations, as is required of the Model subsystem, so I won't restrict its implementation so much (for instance by having only interfaces available to clients). But I think it will be easier to use and maintain if it loses some of its singletonitis. Check the ideas in issue #4885 and please send some feedback if you have ideas on how to deal with this differently.

[1] In modern programming languages, "global variables" aren't normally global anymore, since they are normally contained in a specific package or namespace. A pair of good examples is java.lang.System in Java and std::cout in C++. These "globals" aren't frowned upon, and the libraries were surely designed by above-average developers, so why is an accessible variable so bad?!?

Wednesday, October 03, 2007

Uff! I finished the GeneratorCpp feature sketch

Previously I stated that I wanted to refactor the GeneratorCpp class. A good way would be to use some techniques I learned from Working Effectively with Legacy Code by Michael Feathers (2004), particularly his feature sketches. Well, I worked hard to put the whole class under scrutiny and the result is very depressing (see below). It is very different from the ones Michael has in his blog post.

Note that I used rectangles for functions and ellipses for variables. This is exactly the opposite of the convention Michael uses in his feature diagrams. My diagram is made in OpenOffice.org Draw and I had to use an A2 page so that everything could fit on a single one. I like this approach because, although it is harder to draw in the first place, once you have it you may drag things, group and ungroup them, etc. This is important to identify and make clear the clusters that may be extracted from the monster with less pain. I put it in the argouml-cpp doc directory, so, if you get curious, try it out.

Picture 6 – Feature sketch of GeneratorCpp. Rectangles are for methods, ellipses for variables, blue for non-static and red for static. The yellow rectangle to the left denotes a cluster of methods that could easily be extracted, the yellow rectangle in the bottom center contains the methods I wanted to extract related to Associations, but, which are hard to extract from the class.

Don't get me wrong: feature sketches are one more good thing that I will add to my tool box, but they must be complemented with other things Michael talks about in his book, such as identifying responsibilities. This is even more important if you are dealing with a monster class such as this. Nevertheless, the feature sketch enabled me to see the clusters in the class. Even more important, while I was doing it I reviewed the code in a way I never did before; actually this was the first time I looked at it from start to end. There are variables that keep pure processing state or context (generatorPass and actualNamespace), others that are mostly read and that keep configuration editable by users (e.g., indent, lfBeforeCurly, verboseDocs) and others that store processing results, which are used to generate code that deals with dependencies (e.g., includeCls, predeclCls, systemInc, extInc and localInc).

As a bonus I discovered some undocumented features of the generator that might come in handy to solve some issues that are mounting up in the issues list. For instance, issue #22 (provide a tagged value for user includes with angle brackets) is already handled by the method addUserHeaders if the user places the header name within angle brackets in either the source_incl or the header_incl tagged value.
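To illustrate the idea (this is not the code of addUserHeaders, just a Python sketch of the behaviour as I understand it), a header name taken from a tagged value could be turned into an include line like this. The function name and the fallback to a quoted include are assumptions.

```python
def include_directive(header):
    """Build a #include line from a tagged-value entry.

    Assumption: names wrapped in angle brackets are kept as they are,
    anything else is emitted as a quoted include.
    """
    header = header.strip()
    if header.startswith("<") and header.endswith(">"):
        return "#include " + header
    return '#include "' + header + '"'
```

So a header_incl value of <vector> would give #include <vector>, while MyClass.h would give #include "MyClass.h".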

Monday, September 17, 2007

Refactoring GeneratorCpp

The GeneratorCpp class contains almost all the code that is used to provide C++ code generation in the C++ module. The file is over 2900 lines! It will get even worse, since the support for C++ notation will reuse it, so we might need to add even more code to it.

It is a monster class. I know that the responsibility of that class is generating C++ code from a UML model. So, a distant observer could say that this respects the Single Responsibility Principle (SRP). But that would be like saying that log4j could be implemented in a single class because it is software that has the responsibility to support logging.

I read the book Working Effectively with Legacy Code by Michael Feathers (2004) and there he describes very useful techniques on how to refactor such code. I'll apply the feature sketch in my current task, where I have to fix a bug related to the way the generator deals with associations, in order to extract some of the methods into a separate class. Then, I'm planning to test if the result is friendly from the perspective of client code from the notation package.

But, before starting to apply a specific technique, I must reason about the responsibilities of the C++ generator. Here is a (probably incomplete) list:

  • conversion of the UML constructs into C++ equivalent code
    • operations and methods
    • attributes
    • packages
    • associations – includes aggregations, compositions, generalizations
  • documentation
  • tagged values
  • coordination of the code generation for a class
  • C++ notation support
  • indentation
  • generation of header files
  • generation of source files

Many of these responsibilities interact and depend on each other. Because it is all contained in one class, everything is getting mangled into a huge mess. I can understand that it seems easier this way, but I think that is true only on the surface.

My idea for now is to separate the aggregation and composition parts from the rest. But it is difficult, because the methods that deal with these two things also contain parts that deal with indentation and documentation. This is messy because when I'm generating code I want it to use the indentation options selected by the user, but when I'm providing C++ notation support I want the same methods not to indent things and not to insert C++ headers into the required headers list. So, I'll look at the methods that support these things and use the feature sketch technique to understand how to isolate some parts.
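One possible direction, sketched in Python with entirely hypothetical names (GenerationContext and association_end_text are not part of GeneratorCpp, and the use of std::vector for multiple ends is an assumption made for the example), is to pass the varying concerns in as a small context object instead of reading them from the generator itself:

```python
class GenerationContext:
    """Hypothetical bundle of the concerns that differ between the two uses."""

    def __init__(self, indent="", track_headers=True):
        self.indent = indent
        self.track_headers = track_headers
        self.required_headers = []

    def require_header(self, name):
        # Notation support switches header tracking off.
        if self.track_headers and name not in self.required_headers:
            self.required_headers.append(name)


def association_end_text(type_name, role_name, multiple, ctx):
    """Render one association end as a C++ member declaration."""
    if multiple:
        ctx.require_header("vector")
        type_name = "std::vector<" + type_name + "*>"
    else:
        type_name = type_name + "*"
    return ctx.indent + type_name + " " + role_name + ";"
```

Code generation would call it with something like GenerationContext(indent="    "), while the notation package would pass GenerationContext(track_headers=False) and get unindented text with no side effects on the header list.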

Saturday, May 12, 2007

New job and Common Lisp

I recently started a new job where we use Common Lisp as the programming language for all the core work. The company is SISCOG and it sells software for planning the human resources of train companies. SISCOG provided an intensive three-week course on Common Lisp but, due to legal and project schedule constraints in my previous work at Nokia Siemens Networks, I only participated in the last day of the course. So, now I'm learning on the job and on my own in my free time. I'm reading Practical Common Lisp, using Lisp in a Box and Franz's Allegro CL, and it is going alright.

Because I was used to developing in Python, I'm not finding it so hard to get used to Common Lisp, even with the very different syntax. The main thing is being used to working in a dynamic language. Ah, and the Lisp prompt (or REPL in Lisp terms) is very good.

The hard thing is to get used to the libraries and toolset to the point where I'm as productive as I was in my previous environment (Java and Eclipse). Now I'm using Emacs as the editor and Allegro/Emacs as the IDE. Besides the GUI framework that comes with Allegro, everything is internal to SISCOG, so I have a lot to learn. And from my previous experience, the best way to do it effectively is by participating in an open source project. There you normally come into contact with the best software engineering practices and the best user/developer relationship possible. So, I already looked around and my two candidates for now are:

  • Maxima – a computer algebra system.
  • Closer – contains a framework for Aspect Oriented Programming in Common Lisp.

Sunday, April 29, 2007

Past and near future development activities in the C++ module

I didn't progress much in the development of the C++ module during 2006. It was a fire-fighting year in terms of my day-to-day job, and I invested a lot of time and energy in getting the fire down and (IMHO) trying to get a rightfully deserved raise in salary. Alas, I'm now in a different job, but let's talk about the progress (humble, but nevertheless some) in the C++ module...

The main problem I worked on was the fact that the C++ notation was gone by the middle of last year. I think that in ArgoUML 0.22 it wasn't available. So, I worked on fixing it for ArgoUML 0.24 because I hate regressions!

Some minor bugs were found (4, 5, 6, 7, 4541), some of them were fixed (5, 6), there is a request for enumeration support (not yet done), etc.

Now, my main focus is on getting an old issue fixed and unifying the handling of special TaggedValues in the generator and reveng modules into the UML profile for C++. This is turning out to be easier than I expected due to the improvements introduced during the last year into the model sub-system. I'll simply put the UML profile for C++ (UMLprof4C++) into the C++ module jar, load it into a separate model and use its services from the generator, reveng and ui modules.

Many hard coded values for TaggedValue names are going away, the code of generator and reveng modules will deal with similar problems by using common services and the users will have an easy way to import the full UMLprof4C++ Stereotypes from the C++ Settings. And I think that now that I have more energy to devote to ArgoUML this will be available for 0.26.

Improving the usability of htmled

I have to improve the usability of my htmled Python module! Currently I have to do something like this:

>>> from htmled import *
>>> hbf = HbFile(open("c:/luis/documentos/cadernos/programacao/ficheiro04.html"))
>>> pe = PostExtractor(hbf)
>>> from datetime import date
>>> posts = pe.getPosts(date(2006, 12, 1), date(2007, 5, 1))
>>> print posts[0]
### here a pretty print of post 0, easy to copy paste into Blogger will be available
>>> print posts[1]
### ditto for post 1

I'm not very put off by having the Python shell as UI. It is comfortable for me and I like its feel as much as other, more normal, people like their GUI applications such as ArgoUML. But there are some things which I could easily optimize, such as not having to write the name of the Handbook File and just selecting which Handbook I want to make blog posts from. Another thing is to have the posts immediately available for selection, with an easier way to specify them.
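As a first idea for the file name part, a tiny helper could build the path from a handbook name and a file number. This is only a sketch: handbook_file is a hypothetical function, and it assumes that every Handbook lives in its own directory under one root and that the files follow the ficheiroNN.html naming seen in the session above.

```python
# Assumption: all Handbooks live under this root, one directory each.
HANDBOOK_ROOT = "c:/luis/documentos/cadernos"


def handbook_file(handbook, number, root=HANDBOOK_ROOT):
    """Build the path of a Handbook File from a handbook name and file number."""
    return "{}/{}/ficheiro{:02d}.html".format(root, handbook, number)
```

With that, the session could start with HbFile(open(handbook_file("programacao", 4))) instead of the full path.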

In addition there is a nasty bug in the parser, which includes the original Handbook File footer into the last post. These are sufficient reasons for an additional iteration on htmled. I'll plan for it as soon as I finish some stuff in the ArgoUML C++ module I've been working on in the past weeks.

Thursday, April 26, 2007

Argonaut's life lost its images

My previous e-mail and static web hosting provider (zmail) was discontinued by its parent company Zsystems. It was a pity because, at the time it actually worked, they were ahead of Google in advanced functionality: 10 GB of space for shared services such as e-mail, static web hosting and XFTP (important in Portugal because the internet service providers have international traffic ceilings). It seemed to me that they even had some non-Portuguese customers paying 12 euros per year.

It is a common lack of vision and courage over here not to invest in such a popular, front-running service. They probably were a bit scared when Google started to be a no-cost competitor, but for a person like me their service offering isn't yet equaled by Google or by any other service provider...

Alas, now the Argonaut's life blog has no pictures. I'm considering my options for placement of this kind of content, but, I would like to improve my online presence a bit, like having a CV online as well as references to this blog.

So, if you were wondering why these pages have no carefully crafted diagrams from ArgoUML, now you know why!

Making icons for ArgoUML

This entry comes long after the fact but, nevertheless, it contains important details on how to get good icons for the ArgoUML GUI. Thanks to Michiel van der Wulp for supporting me in this work, with the icons specifically and with implementing the C++ Notation in general. By the middle of 2006 I detected that the C++ notation was no longer working in the ArgoUML diagrams and created issue #2 in the C++ module issues DB. The whole Notation sub-system had been refactored by Michiel and the C++ module had to adapt. So, to make a long story short, I needed to create an icon for the C++ notation as part of the work, and what follows is a recipe on how to do it:

Note: these instructions are for Windows. If you're on Linux, you'll have to look elsewhere.

  1. Start from one of the icons stored in the Images directory of the source code.
  2. Using MS Paint, draw the icon according to your artistic capabilities. You'll have an easier time using a gigantic zoom like 800%.
  3. Then, use IrfanView to reduce the color depth to 4 colors and to set the transparent color.

But, if you want to see the whole story of getting Notation icons working from ArgoUML modules, you'll have to use different loading code than what is used in the core ArgoUML code. Please check the code in NotationModuleCpp.java.

I hope this tiny cookbook helps others and saves core developers from having to answer trivial questions again. And once more, all credit besides writing this tutorial goes to Michiel, who helped me a lot while I was doing this.
