As an ArgoUML contributor I'm going to blog my activities here, so that they may draw interest by other developers or help other developers when doing tasks similar to what I've done. AND(!) the grand vision that makes an Argonaut what he is, TO THRIVE IN THE BIG DANGEROUS WORLD, TAKING THE Argo TO A GOOD SHORE ;-))

Friday, November 20, 2009

UML profiles management issues

I'm currently handling two issues related to ArgoUML management of profiles:

Monday, November 16, 2009

Other software ideas

These aren't related to ArgoUML, but, since they are strongly related to software, they'll be registered here.

Emacs Lisp Unit

Build a unit test framework for Emacs Lisp and with that learn more about Emacs Lisp. Upsss, there is already ElUnit and behave.el.

Contribute to jnil

jnil is an open source project by a colleague from SISCOG, Tiago Maduro Dias. Initially I wanted to join some open source Common Lisp project and the prospect of being able to pair program an open source project with a more experienced Common Lisp developer was very interesting. Unfortunately, I didn't want to leave ArgoUML without reaching a feeling of closure, so, I never joined. It is still an interesting idea...

Genetic programming based in the Common Lisp Condition System

Since I read about the idea of evolving software with genetic algorithms, I was fascinated by it. When I really learned about the Common Lisp Condition System and its capability to offer restarts I wandered if it would be possible to combine the two to provide a way to avoid the need to have mutations to generate correct Common Lisp. It might be a good way to make safe boundaries between mutated code and stable code, to avoid infinite loops, deadlocks, etc.

FitNess Common Lisp adapter

Create an adapter for Common Lisp for FitNess, to enable using this acceptance test framework in Common Lisp projects. With ABCL maturing very fast and being well supported, I wonder if this would be difficult at all.

Learn Clojure

Give myself some time to learn Clojure.

Develop organization absences planer in Google App Engine

The process(es) that organizations normally use to plan their collaborators vacations and absences are usually document driven and the verifications and checks of accordance with the rules and legislation manual. I suffer from this at SISCOG and in SIEMENS, although it wasn't document based, it lacked considering the organization needs, like to have a minimum number of persons in a certain development team available at all times in a year. It didn't considered previous history, like a certain person always not getting his vacations in the Christmas season while others always get them.

It would be a perfect opportunity to learn web development techniques in a restrained platform such as Google App Engine, but, possibly using a JVM based language, such as Clojure or Jython. I wanted also to show this at SISCOG, to have a demo-able example of a Cloud based application, which someday SISCOG may want to try with more commercial purposes...

Online git repositories

I started using distributed revision control systems (DRCS) a while ago, first with mercurial (hg), then with Git, when my ex-colleague Tom Schutzer-Weissmann created a system to import the SISCOG repositories from the in-house defined format (CRM/CRI) into Git. Currently the Git repositories are being used by several persons in SISCOG, but, I still haven't been able to proceed with the work enough to consider the proposal of a switch or a pilot.

Currently I'm more familiar with Git, but, my problem is that I'm not cooperating online using one of these DRCSs. So, I wanted to setup a GitHub account and move in there some of my repositories and potentially host there a fork of ArgoUML. By having a Git based ArgoUML fork, I would have a much easier life in working offline with contributions and a much nicer sandbox to experiment ideas and patches. This I know for sure, because I have been using the branching features of Git in SISCOG for some time with a huge success and satisfaction.

Finish reading Practical Common Lisp

I stopped reading Practical Common Lisp in the chapter 23. Practical: A Spam Filter. I should give myself some time to finish reading and exercising the book. While at it, I should buy a copy of it, since it is excellent and I would have something physical to borrow to newcomers at SISCOG.

Proceeding with ideas on books to read, On Lisp, by Paul Graham is high on my Lisp related books and several others I placed in my Amazon wish lists.

Two ideas related to ArgoUML

Following with my write-up of ideas to avoid neurological garbage collection before delving into ArgoUML for some time, follows a list of ideas related to ArgoUML, in which I won't elaborate as much as in the entry about medeia.

abstracter

The idea is to register in the background decisions that the ArgoUML modeler does, so that we may deduce from his actions a set of steps that might be used to solve similar problems. This is similar to a refactoring system that my colleague Rui Patrocínio explained to me. This system, takes a set of programmer actions and may apply them to a similar problem upon a programmer request.

This is called abstracter, because the goal would be to give such a tool the reverse engineering information from a software project and let it abstract away the details, creating easy to use diagrams and perspectives about the software. Kind-of something like pointing it to the OpenOffice.org source and letting it create some meaningful abstractions of it. The part where it is learning or being corrected by a developer is interesting, because it would allow experimentation with machine learning principles...

ArgoUML Lisp module

Implement a Lisp language module for ArgoUML. Probably this could have dialect specializations (Common Lisp, Scheme, Clojure, etc), but, have a reusable base. My idea is that this should be implemented in the specific languages and not in Java, in order to more easily receive contributions from developers that are primarily interested in the module.

Note: Rui Patrocínio referred a related project he was considering to do which is a static analyzer for Common Lisp so that it would be possible to support code browsing and refactoring.

The reverse engineering of Common Lisp, Scheme and Clojure should be based in the same communication protocol as the one used by SLIME, which allows things like running APROPOS.

Sunday, November 15, 2009

medeia

From all the ideas my idiot's head has had in the past few years, one has been buzzing very strongly and that is medeia. The name Medea comes from the Argonautica book, so, it is related to ArgoUML and it is the name of a woman that helps the Argonauts and Jason in particular to get the Golden Fleece. She knows one thing or two about magic and uses it. Medeia is Medea in Portuguese related to mediate. Well, the name points a bit to what the idea is about...

medeia is a good name because it is similar to idea in Portuguese, which is written ideia.

As the responsible and main contributor to the ArgoUML C++ module, in the past 5 years, I've seen that this – making a language module – is a hard task. There is a lot that must be considered to provide production ready code generator, reverse engineering, notation and round the horn language module(s). Normally the effort is single handed and when you reach the end, there will always be loopholes and needs to extend it. But the worst part is how much of reinventing the wheel goes into it. Lets put the things in a clear way, by enumerating each part of a typical language support module:

  • UML Profile for the language:
    • You need a core language profile with stereotypes that may be used for tagging UML model elements, in order to apply tagged values to the model elements.
    • To use the profile in a programmatic way you need some wrapping classes, such as the class ProfileCpp in the C++ module.
    • Potentially you would also want to place in the module UML profiles for libraries that are used in plenty of projects that use the language – in the case of C++ we may have boost.

  • Language generator:
    • Use of a template engine for formatting the generated code. (For an example see StringTemplate.)
    • A code generator, which maps the UML model into an Abstract Syntax Tree (AST) which would be handed over to the template engine.
    • A way not to trash the code that already exist, like the current protected sections in the C++ generator.
    • Some form of mapping the documentation in the UML model into the language standard(s) format(s). And you need this to be customizable because projects typically develop specific variants of the documentation standards or there is some company standard that they must use. I would say that this fits more or less in the same problem/solution as the code generator being made by two parts – a mapping from UML to an AST and a template engine which is fed by the AST – but, I think that here there might be more needs for giving customization options for enhancing the AST generator.
    • Potentially there is the need to create code from specific UML diagrams or the model elements beneath specific diagrams. For instance, generating the classes that implement the state design pattern for a state machine (check also this nice article by Robert C. Martin (aka Uncle Bob) on finite state machines and UML.
      As a side note, you may not need to do this if you use the ArgoUML Pattern-Wizard module.

  • Language reverse engineering:
    • There must be a language grammar, typically, in ArgoUML, this is based in ANTLR and it is common that there is something for your target language.
    • With the language grammar you can generate a Parser, it might even be possible that an AST walker is generated for you, but, your mileage may vary and the available Parser and AST walker might not be fully customized to your needs. For instance, I was capable of getting a ANTLR C++ grammar ported to Java, but, this wasn't prepared to make an AST and I failed to change it into one. It has some bugs which up-to-now I didn't solve. Finally, I'm the one that must keep it evolving with the original grammar which was made for C++. Even the original grammar is now unsupported, which is problematic with the evolution of the C++ standard to C++ 0x.
    • Even with the AST walker there is the need to perform the mapping from language constructs to UML model elements. This mapping must be performed in a way that doesn't create duplicated model elements when the same source files are imported multiple times.
    • The grammars are normally created for blue sky situations where you have all the information, the source code is well formed, etc, etc. The reality is that the reverse engineering modules must be designed to handle malformed sources and work with incomplete information – i.e., partial imports.
    • In C++ you need a C preprocessor. Fortunately there is one open source C preprocessor implemented in Java which the C++ module now uses.
    • Users normally aren't satisfied by simple import and they normally want that the reverse engineering creates some simple UML diagrams from code. The Java module has this.
    • Finally, the killer feature of a UML reverse engineering module is for a user to point it to a build definition file – for instance a make file for C++ or an Ant build.xml file – and the reverse engineering module would interpret it and import the underlying project fully into a UML model.

  • Language notation – this enables the user to edit UML with the language notation.
    • There is a simple way to create a read-only notation for the language, which is basically to use the code generator in a partial way. That is what the C++ module does in the notation package.
    • The complex part is to create a parser for the notation. The best example is to check the parser that Michiel van der Wulp developed for UML notation. Neither C++ nor Java have a notation parser. If there is such a parser for the language notation it would enable fully editable UML models in the programming language.

  • Round the horn engineering for the language.
    • This is the Lucy in the sky with diamonds scenario. You edit the model and say update code and the code is magically updated without trashing the existing code and potentially adapting the dependent code to the changes you made. Even better, it would understand enough of the build definition to add new files to it or remove unused ones.
      My knowledge about tools that support this is that Together did a good job and that Rational Rose for M$ VC++ did a bad job at it. It is a long time since I tried tools such as these in a professional setting.

  • Language module settings user interface – finally, for each of these things you need to consider how the user is going to interact with it and configure it to his needs.
    • A nice GUI for some things, like the generator options concerning the mapping from UML model elements into the language constructs (e.g., see SettingsTabCpp, but, this has some mixture of several parts of the generator).
    • The reverse engineering also has a GUI which is placed in the dialog that is shown to the user when importing code. Currently the C++ reverse engineering module has no such GUI, but, you can see an example in the Java module – JavaImportSettings.
    • Then there are parts of the UI that may have the form of a template file, but, this is UI and, as such, the template engine must deal with the potential errors introduced by changes the users did.
    • All of this and more must somehow be documented in the user manual, which potentially is translated into other languages...

Description of medeia

So, enough of complaining myself about the fun, but, hard and difficult to finish work of developing language support for ArgoUML in the classic way. Lets now delve into what medeia is!

The medeia idea came to me a long time ago, but, in a gradual way. First I thought about a different solution to develop the reveng C++ module, which was to base it in the reuse of libraries made by the Eclipse CDT project. Then, the obvious stroke my mind and I thought of this as a potential solution for several other language modules, just as the Eclipse guys add support for more and more languages, the ArgoUML could grab it and reuse it. Recently, with the evolution of polyglot software development and the dynamic language boom, both Eclipse and NetBeans are adding support for multiple and interesting languages.

Then, with the Argoeclipse project, I thought that this would be a perfect opportunity to revisit this, but, instead of reusing libraries by placing them in the ArgoUML modules, we could just mediate the things that the existing Eclipse environment knows about the projects the user has in his Eclipse workspace and map those into ArgoUML and back, reusing for this the whole infrastructure that Eclipse has. Note that this is also possible for an eventual Argonetbeans.

Finally, while working at SISCOG and understanding the way Emacs connects via SLIME or ELI to a live Lisp and how that enables you to discover on a live and fully consistent software program about its static structure, I came to the conclusion that medeia could be a bit different for languages that have reflexion built-in (Java included)! You just need to place a little piece of software within the program you want to know about and inquire it from the ArgoUML side. No need to parse source files, understand build systems, etc, etc. Well, you still have some work for generating code and to provide a language notation, but, the medeia idea solves a large part of the problem of reverse engineering modules for a language. It creates some new, though...

Classical reveng module medeia reveng module
Language grammar, AST walker, parser, etc Interface with the IDE (for languages that don't have powerful enough reflexion facilities, such as C++, C, Fortran and etc) or with the live program (for languages that do have such reflexion facilities), that communicates with the ArgoUML module or with Argoeclipse or with Argonetbeans or with Argowhatever.
Mapping from language to UML. Ditto, also needed here.
Robust software to deal with incomplete information or malformed source code. Software that must support some way to filter the information that matters, maybe by allowing the user to specify that the import must be from specific packages or / and by allowing to get the information by specifying incomplete names – similar to IDEs' find type.
C preprocessor or similar things, largely dependent on language quirks. Nothing like that needed on this side, thanks :-)
UML diagrams auto-magically created by the import sources process. Ditto in here too, but, with the difference that probably the user might need to ask for diagram creation and update or offer some settings and try to do the correct action for different user actions.
Understanding build definitions to import full projects with the minimum work by the user. No need for this just for reverse engineering, but, in complex multiple-project scenarios, maybe the users want that the correct UML elements go into the correct UML models... The filtering described above might be a good way for certain languages, but, for languages that simply support namespaces or that are used in a flat package way, like Lisp, this wouldn't work so nicely.

Do you like this idea? Have you seen it applied to other UML tools or to solve similar problems? I recall that this is already used in the context of Emacs and Lisp / Scheme / Clojure, with the SLIME project or others similar to it. But for a UML tool you normally go the way of the classical reverse engineering design described above. Well, maybe Together worked this way when integrated within Eclipse...

Wednesday, November 04, 2009

Dispersion problem – why am I contributing so little to ArgoUML?...

At SISCOG some of the employees – I included – attended a leadership workshop that lasted 3 days and was distributed by February, March and April of 2009. It was very interesting and useful for the development of my soft skills, which I need as part of my role as a team leader. The person giving the workshop was Teresa Roseta of MQI, whom I strongly recommend. SISCOG developed the leadership principles and values in the workshop.

One of the things that was referred there was the need for a person to know himself better. This because it seams that normally good, natural leaders do know themselves very well. One exercise we did for each of us was our professional coat of arms. It is a bit like a personal SWOT analysis, but, the idea was to minimize the number of things that go into each quadrant and to have a middle row with your motto. The things that go into the quadrants are different too. To make a long story short here is my professional coat of arms (the words in [emphasis] are just legends for the contents):

[Strength] easy learning and appreciation for knowledge [Dream] to fulfill one or more remarkable ideas (hint: MEDEIA)
[Motto] do things right (i.e., well/correctly)
[Vulnerability] dispersion [Facilitator] persistence

My vulnerability is dispersion. This causes some problems for my average rate of contributions for anything that isn't my primary activity and currently for my contributions to ArgoUML, which are too few and too slow. Issue 5864 illustrates this. I was given a patch, so, I looked into it and thought, OK, lets help this person scratch his hitch and maybe ArgoUML can get a new developer aboard. Then a difficulty came and I feared that it was something that was related to a part of the code that I never delved into and I placed it on hold until I had enough energy for it. But, in the meantime I had enough energy to do a Programming Praxis problem, to start developing a friendly user interface for htmled, to start a crazy attempt to update all the software in my old Windows XP laptop and to investigate the possibility to move some personal Subversion repositories into Git! At least these are the significant things I remember about between 7th and 30th of October. Why didn't I finished the thing I started in the first place? Dispersion!

I disperse because I'm the kind of person that loves to learn and I have lots of ideas. These ideas are all very interesting for me, but, they take some time to look into, develop and document. They are mostly related to software development, but, they don't intersect as much as possible for them to be developed together or in order for some parts to add to the others. So, these things start conflicting with each other and I don't get much done of anything unless at my work, where I managed to finish some significant things.

I like ArgoUML. The developers there aren't the greatest community ever, but, they are nice persons who care most about making ArgoUML incrementally better and some are pretty decent software developers with high commitment level. I think that I contributed some important work for the C++ module and did some nice work in the UML profile support. But, I'm involved in ArgoUML since 2004 and the pace of my contributions has been very snail like or even slower. Even worst, both the C++ module and the profile support seam a bit of an half-finished business. I think I know where the problem lies – dispersion!

So, I do I fight this? I do it by setting up a challenge for myself – spend one year focused at contributing to ArgoUML. Forget about the software blogs, software books and exercises. Blog about ideas I have, take some notes, so that they don't get neurologically garbage collected, but, then go back to work. In the end, take some ArgoUML vacations for some undetermined time, look at one of these ideas and pick it up if I want to give it a try. (Probably I will want to do something related to my work at SISCOG and if so, it may take one year or more to finish, but, I don't know and for now don't care what that will be.)

So, my aim is mainly to move profile support in ArgoUML up into something that looks a lot like finished business and try to do the same thing for the C++ module (this one is harder, though).

But first, as a last bit of devilish concession to my dispersion I'll spend some time writing and publishing some of the significant ideas in either Argonaut's life if they are ArgoUML or software development related, or in Idiota if they aren't.

Oh, and I have an idea for a trick that will help me not to get distracted. I'm going to get myself an ArgoUML bracelet, just like Uncle Bob has his green wrist band to remember himself that code must be clean and tests green. Maybe I should get a green ArgoUML wrist band!

Reader Shared items

Followers