As an ArgoUML contributor I'm going to blog my activities here, so that they may draw interest from other developers or help other developers when doing tasks similar to what I've done. AND(!) the grand vision that makes an Argonaut what he is, TO THRIVE IN THE BIG DANGEROUS WORLD, TAKING THE Argo TO A GOOD SHORE ;-))

Tuesday, May 24, 2005

PluggableImport implementation and 4th plan update of reveng drop 1 (also some new ideas)


I finished reading The C++ Programming Language, 3rd edition, by Bjarne Stroustrup, the creator of C++. This was an important step towards mastering the language. Previously I read introductory books by Bruce Eckel, which put me on track, but reading Stroustrup's book was needed in order to have a solid understanding of the language. One thing I regret is not having solved the exercises. I should have solved at least a selection of three or four per chapter.

So, next steps? I think that continuing with the work on the C++ module is a nice complement to the book. Once reverse engineering is working satisfactorily, working on a C++ profile, which makes the standard library available for use in models, is something that would naturally improve my knowledge. I could also get involved in an open source project where C++ is the implementation language. Another possibility is to work on the Doer enterprise pattern. We'll see...


htmled, the handbook editor and blog publisher

I take notes in my personal programming handbook. This handbook is digital and written in HTML. It is from the handbook notes that I create the Argonaut's life blog. It doesn't take much effort, but this is the kind of repetitive task that I, as a Software Engineer, should consider for automation. Doing it manually is boring and error prone.

Previously I considered automating the editing of my handbook. My idea is to write a plugin for eclipse or a module for Netbeans, where all the things I do manually while adding content to the handbook are automated. The task of copying and adapting content from the handbook to the blog would be automated too.

The handbook has a different structure from the blog, since I normally write top down and it is usually offline. It is stored as HTML files, and I try to keep these files under 100 KB. In the blog the latest entry goes at the top of the page. The blog has some automation that archives previous entries by month or year. Pictures and files aren't allowed there, so I store them on my website. I must also take care to change the internal links and to censor the parts of the handbook that I don't want to publish.

In a perfect solution, I would edit my handbook, creating entries and changing others within the IDE, while I work on ArgoUML or something else. I would have support for automating some specific tasks, such as entering Portuguese accents. It would make adding links easy, distinguishing internal (within the handbook), local (within my PC) and external links. It must provide many of the basic functionalities of Web editors, but this isn't very important, since I'm comfortable working in HTML source.

When I go online (yes, I'm not always connected!) and want to put some content in the blog, it will connect to the blog, discover what has changed and update it. The parts of my handbook I've marked as censored would not go online. A nice-to-have is for the plugin to warn me about new comments and, as in an inbox, show the headers and a small preview so that I can see what people have been saying.
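The censoring step could be as simple as stripping marked regions before anything is uploaded. Below is a minimal sketch in Java, assuming the private parts are wrapped in marker comments. The `<!-- censored -->` markers and the `HandbookCensor` class are invented here for illustration, since I haven't settled on how the handbook would mark them.

```java
import java.util.regex.Pattern;

/** Sketch: drop censored regions from handbook HTML before publishing. */
final class HandbookCensor {
    // Hypothetical marker comments; the handbook has no such convention yet.
    private static final Pattern CENSORED = Pattern.compile(
            "<!--\\s*censored\\s*-->.*?<!--\\s*/censored\\s*-->",
            Pattern.DOTALL);

    /** Returns the HTML with every marked region removed. */
    static String publishable(String handbookHtml) {
        return CENSORED.matcher(handbookHtml).replaceAll("");
    }

    public static void main(String[] args) {
        String entry = "<p>Public note.</p>"
                + "<!-- censored --><p>Private note.</p><!-- /censored -->"
                + "<p>Another public note.</p>";
        System.out.println(publishable(entry));
    }
}
```

A regular expression is enough here because the markers are comments that I control; it would not be a safe way to process arbitrary HTML.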

So, as with everything, I'm lazy and I should look for something that already exists, so that I may get back to my C++ module. Since I'm on the train right now, it will have to wait for some future opportunity.

CppImport: some notes on the current state of the implementation

I previously wrote that this is done. It is just partially done. There are issues which I documented in the source file as TODOs. These are easily detected by the IDE, so I won't forget about them, but I shouldn't write things that aren't true, hiding future work.

Check spelling in Vim

I use Vim for many things, most importantly to write these words. Since English isn't my mother tongue I have to check spelling after writing. I do this in the Mozilla Composer. This isn't very productive, so is it possible to do it in Vim?


Another possibility would be to tweak the Mozilla Composer until it does the editing as it should: configurable, respectful of source code formatting, with support for automation, etc. The main reason I have always rejected these WYSIWYG editors is that they produce such bad HTML source! Maybe it's time for me to make them the way I want!

ModelerImpl: starting-up

Now I'm going to start to put the parsed information into the model. After the grammar, this will be the hardest part of this issue. A Modeler implementation must:

  • understand very well the C++ syntax and semantics
  • keep the state of the parsing in a way that when it is called about some parsed construct it models it according to the context
  • understand very well UML
  • model C++ in UML according to guidelines accepted in common by the community
  • be flexible, i.e., it should be possible to configure it according to the user's preferences
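The second requirement, keeping the parsing state so that each construct is modeled according to its context, can be sketched with a stack of enclosing scopes. This is only an illustration under my own assumptions: the class name `ContextTrackingModeler`, its method names and the `shapes` namespace are hypothetical, and a list of strings stands in for the real UML model.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch: a modeler that interprets each parser callback in its enclosing scope. */
final class ContextTrackingModeler {
    // Enclosing scopes, outermost first.
    private final Deque<String> scopes = new ArrayDeque<>();
    // Stand-in for the UML model: one line per modeled element.
    private final List<String> modeled = new ArrayList<>();

    void enterNamespace(String name) { scopes.addLast(name); }

    void exitNamespace() { scopes.removeLast(); }

    void beginClass(String name) {
        modeled.add("class " + qualify(name));
        scopes.addLast(name); // members that follow belong to this class
    }

    void endClass() { scopes.removeLast(); }

    void attribute(String type, String name) {
        modeled.add("attribute " + qualify(name) + " : " + type);
    }

    List<String> modeled() { return modeled; }

    // Prefixes the name with the scopes that are currently open.
    private String qualify(String name) {
        return scopes.isEmpty() ? name : String.join("::", scopes) + "::" + name;
    }

    public static void main(String[] args) {
        ContextTrackingModeler modeler = new ContextTrackingModeler();
        modeler.enterNamespace("shapes"); // hypothetical namespace
        modeler.beginClass("SimpleClass");
        modeler.attribute("int", "newAttr");
        modeler.endClass();
        modeler.exitNamespace();
        modeler.modeled().forEach(System.out::println);
    }
}
```

The same `attribute` callback produces a different model element depending on which class and namespace are open when it arrives, which is the point of keeping the state.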


Starting with PluggableImport implementation

So, going ahead with this... I already found some problems with the java generator:

  • The java generator doesn't generate the standard javadocs correctly. I will try to fix this, but, first I must report it in an issue.
  • I must find a way to use the generator with a template, or make it more customizable. If there is a way to do this I must know about it. The manual doesn't describe this possibility. Alas, it doesn't have anything specifically describing the java code generation... I'm going to look in the GeneratorJava source for possible hints... At least from what I see in the generateHeader method, this is marked as a TODO. It seems I have an opportunity to do work both in the java generator and in the C++ generator, which from the readme.txt file seems to be ahead.

Now I have the CppImport implementation, but the module only exposes the GeneratorCpp class and therefore the org.argouml.application.modules.ModuleLoader doesn't find it. One question remains to be answered: is it possible to have two modules in one jar file? If so, then I may just make the needed adjustments in the manifest file of the C++ module. If not, I may have to implement a facade that implements both modules and delegates each method to the appropriate one, but this will be messy. By adding a new entry group with Name: org/argouml/language/cpp/reveng/CppImport.class to the manifest it already works. Nice :-))
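To see what "a new entry group" means in practice, here is a small sketch that parses a manifest with two named sections using the standard java.util.jar.Manifest class and lists the entries. The manifest text is an assumption made for illustration: only the CppImport entry name comes from the notes above, the GeneratorCpp path is my guess at the existing entry, and the real manifest carries further attributes that are omitted here. The `ManifestModules` class is invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.TreeSet;
import java.util.jar.Manifest;

/** Sketch: list the per-class entry groups declared in a jar manifest. */
final class ManifestModules {
    // Illustrative manifest with one named section per module class.
    // The GeneratorCpp path is assumed; other attributes are left out.
    static final String MANIFEST_TEXT =
            "Manifest-Version: 1.0\n"
            + "\n"
            + "Name: org/argouml/language/cpp/generator/GeneratorCpp.class\n"
            + "\n"
            + "Name: org/argouml/language/cpp/reveng/CppImport.class\n";

    /** Returns the names of all named sections, sorted alphabetically. */
    static TreeSet<String> moduleEntries(String manifestText) {
        try {
            Manifest manifest = new Manifest(new ByteArrayInputStream(
                    manifestText.getBytes(StandardCharsets.UTF_8)));
            return new TreeSet<>(manifest.getEntries().keySet());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        moduleEntries(MANIFEST_TEXT).forEach(System.out::println);
    }
}
```

Each `Name:` group is a separate section of the same manifest, separated from the others by a blank line, so one jar can describe several classes without any facade.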

OK, apart from reporting the issues above (and probably working in fixing them), this is finished. So, proceeding with the next step.


Today I finished an initial part of fixing the grammar – to provide information from parsing to the Modeler. Instead of proceeding right now with a complete fix I'll address another area of the drop 1 plan and start the implementation of PluggableImport.

But first I'll update the plan to reflect how the execution has evolved.

4th version of the plan for C++ reveng drop 1

This plan update is less radical than the previous one, but it reflects my improved knowledge of the scope of the problem. I'll update the efforts, both the estimated and the actual until now.

  1. Learn how to build an ANTLR parser for debugging and build one. Estimated Effort (EE) = 4 Mh; Short Name (SN) – ANTLR parser 4 debugging

    2005-02-25 DONE – Not actually a fully debug-enabled parser; instead it uses the trace capabilities of the ANTLR generator. Note that this makes the parsing much slower and is only usable with small files! Actual Effort (AE) = 2:36

  2. Debug the C++ grammar and make it pass the tests. EE = 20 Mh (see below); SN – Fix the C++ grammar

    2005-03-07 PARTIAL – It parses cleanly a simple class and a code snippet with which I was attempting to reproduce the current error in parsing of quadratic.i. AE = 2:51

    2005-05-13 PARTIAL – I've postponed fixing the parsing of quadratic.i, since it seems to be complex to solve. Instead, I've started to fix the grammar in order to provide the parsed information to the Modeler. The test case TestCppGrammar.testGrammarCallbacks2Modeler() shows the partial fix. This problem changes the effort estimate for this task dramatically, since it introduces a huge amount of work that must be done. New EE = 140 Mh; AE = 15:54 Mh

  3. Commit the result of this work and send it to Yolanda. Update the issue. EE 3 Mh; SN – Commit, Yolanda and issue

    2005-03-09 DONE – I committed the work in progress and updated the issue. Due to the release of a stable version of ArgoUML the work was committed in branch cpp_reveng_work_while_0_18_release. I only sent to Yolanda an e-mail of thanks. I'll send her the version that will be sent to the ANTLR list. AE = 1:36

  4. 2nd re-planning of C++ reveng drop 1 EE 2 Mh; SN – 2nd re-planning of C++ reveng drop 1

    2005-03-12 DONE – Re-planned and updated the ProcessDashboard phases. AE = 1:39

  5. Update the model to reflect the new package and the grammar use. EE = 2 Mh; SN – Module model update for the C++ grammar

    2005-03-16 DONE – Updated and committed. AE = 2:14

  6. Make tests that show how the parsed information may be used for reveng. If any issue exists, analyze how it is done in java reveng and fix the grammar as needed. This includes creating test cases, or improving the current ones, to prove how the parsed information may be used for reveng. EE 5 Mh; SN – Prove that parsed information is useful for reveng

    2005-03-18 DONE – a big issue exists, as documented in the above step. AE = 1:47 Mh

  7. Planning of C++ reveng drop 1. This task includes all the subsequent plan updates of drop 1, instead of having the planning tasks spread throughout the plan. I estimate I'll have 3 more plan updates, including the final, in the end of drop 1. EE 6:00 Mh; SN – Planning of C++ reveng drop 1

    2005-05-13 PARTIAL – 4th version (3rd update) of the plan. AE = 1:25

  8. Model the implementation of org.argouml.application.api.PluggableImport interface in the C++ reveng module. Generate the realization of the designed classes. If there are issues in the generation, report them in issuezilla. EE 5 Mh; SN – Model and generate the realization of the PluggableImport interface

  9. Close the circle, by making the module support reveng of preprocessed C++ files. EE 15 Mh; SN – Module support of reveng of preprocessed C++

  10. Send a working vanilla version of the grammar to the ANTLR list and announce its use within the ArgoUML project. Provide feedback as appropriate. Automate the adaptation of the files in the module build script. EE 5 Mh; SN – Send grammar 2 ANTLR list

  11. Enjoy and celebrate the achievement! Go back to planning next drops. EE 4 Mh; SN – Plan next drops


I have some strange errors when parsing the SimpleClass::newAttr declaration. For some reason it isn't reported from the parser as was the SimpleClass::newOperation declaration...


MDA based programming languages specification and development

While developing the C++ module for ArgoUML I started to think about how much repetitive, low-level work I would have to do for the yet-to-start Python module. And how about the amount of work that goes into maintaining the java and classfile modules? This is something that should be better supported within ArgoUML than by just providing templates, e.g., the dummylanguage and the existing implementations. We could provide some MDA support for this kind of work, such as the possibility of specifying a PIM of the language support module, with Generator, Importer – for reverse engineering – and Synchronizer – for Round Trip Engineering (RTE) (this one depends on the other two being available) – as components. This Platform Independent Model (PIM) would then be transformed into the PSM that would target ANTLR based reverse engineering, a StringTemplate based code generator and a common generic GUI based synchronization. This would make life a lot easier for the development of language support modules and would also, hopefully, provide more reuse opportunities out of this development!
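The component structure of such a language support module can be sketched in plain Java. Everything below is hypothetical: the component names follow the text above, but the signatures, the `LanguageModule` class and the string-based "model" are invented for illustration and are not ArgoUML APIs. The sketch only shows the dependency rule, namely that a Synchronizer exists only when both the Generator and the Importer do.

```java
import java.util.Optional;

/** Hypothetical component contracts; the signatures are invented. */
interface Generator { String generate(String modelElement); }

interface Importer { String importSource(String sourceText); }

/** Round-trip support, built only from the other two components. */
final class Synchronizer {
    private final Generator generator;
    private final Importer importer;

    Synchronizer(Generator generator, Importer importer) {
        this.generator = generator;
        this.importer = importer;
    }

    /** True when importing the generated code yields the original element. */
    boolean roundTrips(String modelElement) {
        return importer.importSource(generator.generate(modelElement))
                .equals(modelElement);
    }
}

/** A language module; either component may be missing (null). */
final class LanguageModule {
    private final Generator generator;
    private final Importer importer;

    LanguageModule(Generator generator, Importer importer) {
        this.generator = generator;
        this.importer = importer;
    }

    /** Available only when both generation and import are supported. */
    Optional<Synchronizer> synchronizer() {
        if (generator == null || importer == null) {
            return Optional.empty();
        }
        return Optional.of(new Synchronizer(generator, importer));
    }

    public static void main(String[] args) {
        Generator gen = e -> "class " + e + " {};";
        Importer imp = s -> s.substring("class ".length(), s.length() - " {};".length());
        LanguageModule full = new LanguageModule(gen, imp);
        LanguageModule generatorOnly = new LanguageModule(gen, null);
        System.out.println(full.synchronizer().isPresent());
        System.out.println(generatorOnly.synchronizer().isPresent());
    }
}
```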

This idea led me to an even bigger one: how about taking a Model Driven Architecture (MDA) approach to the specification and development of programming languages? This would imply having reverse engineering of programming language grammars into a UML profile for language grammars. Then we could define an MDA transformation from the Computation Independent Model (CIM), which is the MOF based grammar model for a given programming language, into PIMs that would be based on a UML profile for doing something with the specified language. We could define a PIM for compilers, another for RTE of the language in the context of a Model Driven Design (MDD) tool like ArgoUML, and another for Integrated Development Environment (IDE) support of the tool. These PIMs could then be transformed into PSMs for specific targets, e.g., respectively the Jython compiler, the ArgoUML python module, and the eclipse plugin for python or Jython support within eclipse.

For this idea to work a MOF based domain language must be specified to enable the specification of programming language grammars in a MOF repository. We would also need support for including the semantics of the language constructs in the CIMs. For this we could use an ontology to enrich the CIMs. This ontology may be based on something existing or on the UML spec.


I'm now delving deep into changing the grammar to make the needed info available to the Modeler. The header and implementation files aren't available, since the parser works with translation units; therefore the modeling of source file components will be left out for now.


Updated issue #2947 – support for C++ reverse engineering, to include the details of the merge and of the option taken to retrieve information from the parsing of files.


I have created the following issues for problems I found previously:

I'm going to commit the work in progress to the branch cpp_reveng_work_while_0_18_release and, since the 0.18 release is done, merge the branch into the HEAD. After that I'll continue development in the HEAD.


The interface will be defined outside of the grammar. This will provide compile time verification between the grammar and what is expected. It will also enable future independence if the current grammar and parser design is replaced. So, let's use ArgoUML a bit to model...
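The idea can be sketched as follows. This is not the interface I will end up with: the method names are hypothetical, and `FakeCppParser` is a hand-written stand-in for the ANTLR-generated parser. The point is that the parser side depends only on the interface, so a mismatch between what the grammar calls and what a modeler implements fails at compile time, and a test can plug in a recording implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback interface, declared outside the grammar file. */
interface Modeler {
    void beginClassDefinition(String name);
    void endClassDefinition();
}

/** Stand-in for the generated parser; it only knows the interface. */
final class FakeCppParser {
    private final Modeler modeler;

    FakeCppParser(Modeler modeler) { this.modeler = modeler; }

    /** Pretends to have parsed a class definition and reports it. */
    void parseClass(String name) {
        modeler.beginClassDefinition(name);
        modeler.endClassDefinition();
    }
}

/** Test double that records the callbacks it receives. */
final class RecordingModeler implements Modeler {
    final List<String> calls = new ArrayList<>();

    @Override public void beginClassDefinition(String name) {
        calls.add("begin " + name);
    }

    @Override public void endClassDefinition() {
        calls.add("end");
    }
}

final class ModelerDemo {
    public static void main(String[] args) {
        RecordingModeler recorder = new RecordingModeler();
        new FakeCppParser(recorder).parseClass("SimpleClass");
        System.out.println(recorder.calls);
    }
}
```

Replacing the grammar or the parser generator later would only require a new caller of `Modeler`; the implementations that build the UML model would stay untouched.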


The test org.argouml.language.cpp.reveng.TestCppGrammar.testCastExpressions succeeds, which shows that something is wrong in the grammar but that it isn't obvious: I can't reproduce the error with a small example of what is failing to be parsed – the CastExpressions.cpp file. That being so, I'll proceed to the next part of the grammar-fixing task: creating a generic modeler interface that is called from the grammar.
