Category archives: written in English

What will the wecena community look like when we’ve solved the problem?

[This post is the 3rd part of my draft application process to the Echoing Green (EG) fellowship program. You can help me earn 60,000 US dollars for the take-off of wecena by commenting on this post with suggestions about how best to make my case to Echoing Green.]

Once the problem of giving innovative non-profits access to professional IT skills and services at no cost is solved, what will the wecena community look like? This is the question EG suggests I answer before submitting my application to their fellowship program (see page 8 of their applicant coaching guide). Coach me by commenting on my attempts at an answer below.

EG suggests I start by answering this 1st sub-question :

If your work succeeds, what will the headline in the newspaper say ?

Let’s try such a headline for 01 Informatique or ZDNet :

“Pro bono IT services a critical enabler for major social innovations from education to environment via poverty reduction.”

Or this one for the French non-profit press (Reporters d’Espoirs anyone ?) :

“An army of computing experts and corporations join the fight for free access to education in villages of the South. French non-profits at the front.”

Or

“Information technology for social good no longer a dream. Low budgets no longer an excuse for non-profit boards.”

For the global press, possibly headlines such as :

“Who’s the best in IT? Exxon Mobil or Greenpeace? Wecena the secret IT weapon for environmental innovations.”

“U.S. Congress to pass a France-inspired law in favour of pro bono service donations.”

And ultimately (just kidding?):

“Price of IT pro bono services on the rise on Wall Street. IT shops in the race for CSR awards.”

2nd sub-question from EG:

If your work succeeds initially and then your organization ceases operations what will the impact on society be ?

My answer :

Wecena’s business model is designed to generate enough profit to give competitors an incentive to emerge and replicate our model. Our early successes, financial transparency and benefit sharing will prove there is a profitable market for IT pro bono service delivery channels. Several organizations are already well positioned to contribute to such a market: sustainable development consultancies, consulting agencies dedicated to non-profits, philanthropy consultancies, non-profit technology assistance programs. In the end, this will give the most innovative non-profits access to a reliable and cost-efficient source of corporate IT pro bono services.

3rd sub-question for setting goals :

How will you measure the volume of your work? And what goals do you have for each in the short and long-terms?

OK. I’m a bit weak on this one. I can think of indicators of success. But I have yet to specify expected levels of success for these indicators. Too high and I am too optimistic, as it would probably exceed our capacity to fund and manage growth. Too low and it would not show the passion there is for this project. Here are the reasonable indicators I am thinking of:

  • number of full-time equivalents (FTE) donated annually by IT corporations to wecena customers (non-profits) as pro bono service deliveries: 4 FTE by the end of 2009, 10 FTE by the end of 2010, total available market of several hundred FTE in France alone (woo hoo!)

Next sub-question, probably harder :

How will you measure if your work is making a difference? And what goals do you have for each measure?

My best guess at the moment :

  • median duration of relationships with non-profits (customer retention): the longer they keep accepting donations, the more useful they probably think these donations are. Goal: 2 years after our first operation, a median duration of at least 6 months for non-profits that accepted their first donation more than 1 year earlier.
  • volunteer recruitment rate: percentage of IT engineers who go on volunteering for “their” non-profit after a pro bono mission with us (rate of volunteering after the end of a wecena mission) => let’s say I’d be very happy if 10% of the wecena engineers kept on contributing at least once two weeks after the end of their individual intervention
  • increased understanding and knowledge of social challenges and innovations among IT employees and managers => 50% more correct answers to online quizzes proposed by non-profit recipients, comparing the start and the end of each individual intervention
  • increased understanding and knowledge of IT use and management among non-profit members => 50% more correct answers to yearly online quizzes proposed by us and by IT donors
  • profits (supposed to come with success, in order to prove there is a market): at least 5% after 2 years from the start of the 1st operation?
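To make these targets a bit more concrete, here is a minimal sketch of how the first indicators could be computed. The function names and the sample figures are made up for illustration; only the thresholds (6 months, 10%, 50%) come from the list above.

```python
from statistics import median

def median_retention_months(durations_months):
    """Median length, in months, of our relationships with non-profits."""
    return median(durations_months)

def volunteer_rate(engineers_total, engineers_still_contributing):
    """Share of wecena engineers who keep contributing after their mission."""
    return engineers_still_contributing / engineers_total

def quiz_improvement(correct_before, correct_after):
    """Relative increase in correct quiz answers between two quiz sessions."""
    return (correct_after - correct_before) / correct_before

# Hypothetical sample data, only to show how the goals would be checked.
durations = [3, 5, 7, 8, 12]
print("retention goal met:", median_retention_months(durations) >= 6)
print("volunteering goal met:", volunteer_rate(40, 4) >= 0.10)
print("quiz goal met:", quiz_improvement(10, 15) >= 0.50)
```

The hard part is of course collecting reliable inputs, not the arithmetic.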

There would be other indicators to monitor but I am not sure how to collect such data and how to process it so that we isolate our specific contribution :

  • evolution of the percentage of IT service companies donating pro bono services (market donors rate)
  • % of IT pro bono services in non-profit budgets (the more, the merrier)
  • evolution in the perception of technology by non-profit social innovators: we can survey this but I am not sure how best to set up such a survey so that it is reliable, comparable and relevant year after year
  • Cost, outreach and depth of impact of social programs powered by our IT pro bono services compared to similar programs not relying on our services => this would be the real evidence of success but I don’t think we can acquire and process such data; ideally, the technology wecena gives access to would multiply the impact and/or reach of a social innovation by a factor of 10: ten times more people accessing open education programs, ten times more people with disabilities turning to computers as a daily tool, ten times less effort for homeless people to find a job, etc.
  • Qualitatively, I would like to hear from non-profit boards that recognized social innovators set themselves new social goals because of the technologies wecena can give them access to.

Hey, what do you think?

What is the underlying cause of the problem wecena is trying to solve ?

[This post is part of my draft application process to the Echoing Green (EG) fellowship program. You can help me earn 60,000 US dollars for the take-off of wecena.]

EG 2nd pre-application question: Root Cause Analysis: What is the underlying cause of the problem you’re trying to solve ? see page 7 of the applicant coaching guide.

1st sub question :

What are the obvious symptoms / apparent effects of the problem ?

My answer :

There are 2 apparent symptoms or effects of our problem:

  1. there is a lack of technology-based social innovations compared to both existing social challenges and to existing non-social innovations
  2. most non-profit organizations under-exploit the capacity technology can bring them when trying to transform society, organizations and markets

2nd sub-question from EG :

Why is it so ?

My answer :

The cost of IT professionals is much too high for non-profit budgets. IT man-days are too expensive on free markets.

3rd sub-question from EG, trying to get deeper to the root cause of our problem

OK. But why ?

My answer :

IT service companies sell expensive services and donate very limited amounts of pro bono services. They usually ask for payment, even when pro bono services would bring them more indirect value at no cost. These donations could indeed come at no cost to donors in France because of an extremely favorable legal framework.

4th EG sub-question on this, getting even deeper to the root cause :

Why is it so ?

My answer :

Non-profit organizations do not offer IT service companies opportunities to donate pro bono services at low (or no) cost and with the promise of a high social impact that will also bring real value back to the donor in the form of reputation, talent retention and better recruitments of fresh engineers.

5th step suggested by EG:

Based on this, what’s the root cause of your problem ?

My answer :

Non-profit organizations don’t have the appropriate skills and knowledge for creating, marketing and managing an offering to IT service companies that would present pro bono service donations as a compelling solution for addressing the need these companies have to prove their “corporate social responsibility” at low cost. Moreover, most non-profit organizations lack the IT management skills that would allow them to collect, deliver and exploit such pro bono donations at a sustainable cost.

We’ve got our root cause (I think). But EG wants to ask one last:

Why ?

My answer :

Nowhere are legal frameworks encouraging and facilitating IT pro bono services as favorable to donors as in France. But some of these laws were passed recently (2003, updated in 2008) and are still under-used and unknown to many potential donors and recipients. Moreover, donor-to-recipient intermediation costs for the delivery of pro bono services are high. And such deliveries are complex to adapt to the business constraints of both corporate donors and non-profit recipients: IT service companies want their engineers and consultants to commit to profit-making long-term missions and can’t afford to miss commercial opportunities because of commitments toward non-profit projects. Competition is fierce and consultant profiles are too often seen as “replaceable” with one another. And because of low budgets, non-profits rarely attain a critical mass of IT needs that would let them justify launching such innovative partnerships.

So what ? EG suggests we end with deriving …

What are the implications for your work ?

My answer :

Wecena proposes an integrated IT pro bono services donation channel that makes the best use of French labor and tax laws, for the greatest benefit of the public good. By focusing only on non-profit needs for IT services and skills, we accumulate experience and efficiency both in how to market our donation proposal to the IT industry and in how to let non-profits make the most of such donations, despite their lack of IT management skills. This focus is also critical to maintaining relatively low intermediation costs.

More precisely, IT donors aren’t asked to commit individual consultants to a given project but to commit to a certain amount of service donation to non-profits. This can generate extremely high staff turnover (individual consultants leave projects as soon as they are assigned to a new commercial mission). But this allows projects to benefit from a continuous flow of skilled professionals to be then turned into online volunteers. As a consequence of this, projects have to be carefully selected on the basis of how resilient they can be to extreme staff turnover. And corresponding project management tools and methods must be offered to non-profit leaders.

Help me earn 60,000 USD for wecena

Wecena services are my new social venture. The US-based Echoing Green non-profit organization helps social entrepreneurs with a 2-year fellowship program and seed grants, including 60,000 USD for the take-off of high-social-impact projects. You can think of them as a social venture fund. I’d like to apply to this project competition so that the wecena concept succeeds at bringing corporate-grade information technology into the hands of the most innovative non-profits in France and all around the world. The deadline for this year’s applications is December 1st, 2008.

Readers, I need your help.

You can help by reviewing the next posts on my blog (I will use the “echoinggreen” tag, you can use this link for follow-up posts). I will post pieces of my draft application to the Echoing Green fellowship program. You can help if you are an English speaker (possibly native…): I need you to correct my English language and style. You can help if you feel concerned with the importance of information technology and the Internet for serving the public with high-impact and broad-reach social innovations: I need you to help me make the case to Echoing Green. You can help if you like my project and would like to contribute one way or another: just tell me you support this whole stuff and share any comment or thought. If you have a couple of hours available for helping, you can even start by reading Echoing Green’s applicant coaching guide.

You can contribute in English (preferred) or in French (that will do just as well). In case you are reading this from the wecena.com website, note that your comments have to be posted on my personal blog (link below).

Let’s start with the 1st pre-application question

Let’s start with Echoing Green’s (EG) pre-application tools. EG suggests applicants (me) should use their questions to prepare their application. Let’s try the 1st question (page 6 of the coaching guide) and throw out an answer…

Problem Definition, What specific problems are you focused on and can you realistically solve it?

Their 1st sub-question :

What specific injustice in the world have you seen that compels you to start a new social change organization ?

My answer:

Non-profit social innovators lack access to corporate-grade Information Technology (IT) skills and services. This is a form of digital divide between non-profit and for-profit innovators. Why would information technology be primarily made to buy more stuff or spread more advertisement ? Why isn’t it more importantly made and used for fighting poverty, overcoming disabilities, sharing education or enhancing public health ?

2nd sub-question from EG :

Who, specifically, is hurt or affected by this injustice and how does the injustice manifest itself ?

My answer:

Beneficiaries in all fields of social innovation suffer from the lack of technology-powered social innovations and from the under-exploitation of technology by non-profits. Had non-profit innovators been given resources to better use technology, the reach of their programs would have been extended and their ability to transform organizations, markets and society would have been increased. More beneficiaries would have been helped better and earlier. Unseen social innovations would have been launched and developed.

EG 3rd question for defining the problem:

Is it realistic that a single organization could address this injustice ? if not, define the problem more narrowly ?

No. We only focus on the access by French non-profits to significant amounts of IT skills and services. Accessing software or hardware is out of our scope. We also only focus on IT needs that represent at least one full-time equivalent of services and skills. Smaller needs and projects won’t be supported (at least not immediately). Direct help to foreign non-profits is not in our immediate scope but we are considering partnerships with foreign non-profits in need of free IT skills and services when it can increase the social efficiency of our effort by supporting higher-impact global social innovations.

That’s it. What do you think ?

Very long-term backup fabbed with a reprap ?

How will your personal data be readable 2,000 years from now? The Long Now Foundation blogs about a nickel-based, 3-inch-wide disk that can reliably hold large amounts of printed data for at least 2,000 years. Data is printed on it in a small font: a 750-power optical microscope is required to read the pages!

On the other side of the blogosphere, the reprap community considers adding an ElectroChemical Milling (ECM) tool head to their home DIY 3D printers :

With this tool head, it could machine any conductive material, regardless of how hard it is or how high its melting point.

Maybe someday, personal very-long term backups will be printed at home ?

At the moment, industrial ECM/EDM machines can “achieve a one micron positional accuracy and wire EDM walls as thin as 0.010” (.254mm)” or (ECM) make holes/traces as small as 0.2 mm wide. I guess some progress is required before 750-power optical microscopes are needed for reading data printed with this technology. But maybe, sometime in the next 2,000 years, fabbers will achieve ultra-precision? I’d be curious to know which technique was used by the Long Now Foundation project and how difficult it would be to port this technique to the wonderful world of fabbers.

Rapid prototyping with microcontrollers ?

I have no clue about micro-electronics and embedded systems. I am a Web application architect and developer, working with very high-level programming languages such as Python (or Perl or Java). I hardly remember assembly language from my childhood experiments with an Apple IIe and almost never touched C or C++. But I have been dreaming lately of rapid-prototyping some advanced non-Web application in an embedded system using my programming skills. So I thought I could share bits of my ignorance here. Please bear with me and give me some hints in order for me to best get out of darkness ! :)

Microcontrollers are now gaining capabilities that are comparable to microprocessors of early personal computers. The two most popular microcontroller (uC) series are Microchip PIC uCs and Atmel AVR uCs. For instance the PIC18F25J10-I/SO costs around 3 or 5 euros per unit at Radio Spares (I am in France: think RS in the UK or Allied Electronics in the USA). It has the following characteristics: 40 MHz, RS-232 capabilities (serial port), a “C compiler optimized architecture”, 48 kB of program memory (Flash mem) and around 4 or 5 kB of data memory (SRAM + EEPROM).

There are nice peripherals available, too. For instance this Texas Instruments CC2500 2.4GHz RF data transceiver (= transmitter + receiver) at around 2 to 3 euros per unit or current sensors at approximately the same price. In fact, peripheral possibilities are limitless…

For free software hackers, there was a Linux version for such chips: uCLinux. But is it still an active project? I think I read that the common Linux kernel now includes everything that is required for it to run in embedded systems. What about GNU utilities? I know there are things like busybox on bigger but still embedded processors (phones). Anything equivalent on microcontrollers?

There are simulators that will… let you pretend your desktop computer has a microcontroller inside, or sort of. :)

There is at least one C library for microcontrollers. C is considered a “high-level programming language” in the embedded world! That is to say that assembly language has been the norm. Some higher-level languages can be used with microcontrollers, including some exotic-to-me Pascal-like languages like XPlo or PMP, or Java-like but living-dead things like Virgil and… what about my beloved Python?

There are at least 2 projects aiming to allow Python programming on microcontrollers. pyastra is a “Python assembler translator” that can be used with some PIC12, PIC14 and PIC16 uCs. But it looks dead. PyMite looks sexier but not much more active:

PyMite is a flyweight Python interpreter written from scratch to execute on 8-bit and larger microcontrollers with resources as limited as 64 KiB of program memory (flash) and 4 KiB of RAM. PyMite supports a subset of the Python 2.5 syntax and can execute a subset of the Python 2.5 bytecodes. PyMite can also be compiled, tested and executed on a desktop computer.
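Out of curiosity, here is a quick back-of-the-envelope check of whether the PIC mentioned earlier would meet the resource figures PyMite quotes. It is only a sketch: it assumes the 64 KiB / 4 KiB figures act as a minimum, and it uses the approximate chip figures I gave above (48 kB of program memory, around 4 kB of data memory), not datasheet values.

```python
KIB = 1024

# Figures quoted in the PyMite description (assumed here to be a minimum).
PYMITE_MIN_FLASH = 64 * KIB
PYMITE_MIN_RAM = 4 * KIB

def fits_pymite(flash_bytes, ram_bytes):
    """True if a chip has at least the flash and RAM PyMite mentions."""
    return flash_bytes >= PYMITE_MIN_FLASH and ram_bytes >= PYMITE_MIN_RAM

# Approximate figures from this post for the PIC18F25J10 (not checked
# against the datasheet).
pic_flash = 48 * 1000
pic_ram = 4 * 1000

print("PIC18F25J10 fits PyMite:", fits_pymite(pic_flash, pic_ram))
```

With these numbers the chip falls short on program memory, so a bigger microcontroller would be needed to even try PyMite.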

At the moment, it seems like Python programming on microcontrollers is a dead end. Nothing worth investing time and effort in, unless you also want to maintain a Python compiler… The same may be true for Java, not to mention Perl. In fact, it seems to me that the object-oriented crowds are too far from microcontroller applications to generate enough interest in initiatives such as PyMite, at the moment. Oh, and I am knowingly ignoring C++ which I did not investigate, having no experience in C++.

So what is left in terms of (open source) programming languages that would be of higher level than C ? The best guess I can make is Great Cow Basic, which is a free software Basic (procedural) language. Example programs look nice to me. It has been active recently. And it supports most of the chips I would consider experimenting with.

Next steps for me, I guess, would be to pick a PIC simulator and an IDE for Great Cow Basic (any eclipse plugin ?). Then I will probably have to figure out how a Basic program can be executed on a simulated PIC. And how a PIC simulator can be useful without all of the electronics that would surround it in any real setup. I’ll see that. When I have time to pursue my investigations and experiments in this micro-world.

And piclist is a great site for beginners.

3D scannerless scanning for fabbers

For several weeks (or more), I have been dreaming of the day I’ll get my hands on a Reprap (self-parts-printing 3D desktop printer, a DIY fabber). I have been lucky enough to have a good friend promise me he would give his free time for assembling such a printer for me as long as I pay for the parts. 3 days of work are required to assemble the parts which you can order via the web in case you don’t already have access to such a reprap, which is my case. I will try to wait for the next major release of Reprap, namely Mendel 2.0 (current version = Darwin 1.0) unless I can’t resist temptation long enough…

Anyway, I have mainly been dreaming of possible applications of fabbers. Their use is extremely competitive (and disruptively innovative) as soon as you want to print customized 3D shapes which can’t be bought from the mass-manufacturing market. For instance, a reprap is cool when you want to print a chocolate 3D version of your face (see the Fab@Home project) or a miniature plastic representation of your home or anything that has a shape which is very specific to your case (not to mention the future goal of printing 90% of complex systems such as robots, portable electronic devices including phones and… fabber-assembling robots…). And this is where 3D scanning is a must : with a 3D scanner, you can scan an existing object and build a 3D model from it which you can then modify and print at the scale you want.

So my dreams lead me to this question: I could get a fabber some time soon but how can I also get a desktop 3D scanner? Some people have already started hacking home 3D scanners. But I had also heard of techniques that allow users to build 3D models from existing objects using either a single picture of the object, 2 pictures, several images or even a small movie. Some techniques require that the parameters of the camera(s) are known (position, angles, distance, …). Some techniques require 2 cameras in a fixed and known setup (stereophotography). Some techniques require that the camera is fixed and the object lies on a turntable. I really know nothing about computer vision and the world of 3D techniques so I was happy to learn new words such as “close-range photogrammetry”, “videogrammetry”, “structure from motion”, “matchmoving”, “motion tracking” (which is the same as matchmoving) or “3D reconstruction”. After some Web wandering, I identified several open source (of course) software packages that could offer some workable path from existing physical objects to 3D models of them using plain cameras or video cameras.

The idea would be the following :

  1. you take an existing, very personal object, for instance your head !
  2. with a common digital camera, you take pictures of your head from several angles
  3. you load these pictures into your favorite 3D reconstruction free software package
  4. it creates a 3D model of your head which you can then export to a 3D editor for possible adjustments (think Blender)
  5. you export your corrected 3D model into the reprap software stuff
  6. your reprap fabs your head out of plastic (or chocolate ?)

Here are the software projects I identified :

  • From a single image :
    • jSVR, Single View Reconstruction, a semi-automatic process for identifying and exporting three-dimensional information from a single un-calibrated image, dead project ?
  • Using a turntable :
  • From stereo images :
  • From a movie or a sequence of pictures :
    • e-Foto, a free GNU/GPL educational digital photogrammetric workstation, but is it suitable for close-range photogrammetry ?
    • Voodoo Camera Tracker, a tool for the integration of virtual and real scenes, estimates camera parameters and reconstructs a 3D scene from image sequences ; oops, this is not free software but freeware only
    • Octave vision, Algorithms for the recovery of structure and motion, using Octave, a one-shot development, no future…
    • Tracking / Structure from Motion, another piece of student homework
    • libmv, a structure from motion library, which plans to one day take raw video footage or photographs, and produce full camera calibration information and dense 3D models, very promising but being rewritten at the moment (August 2008)
    • GPU KLT a high-performance research implementation
  • Using the shadow of a stick (!) :
    • Scanning with Shadows (see also this site), wave a stick in front of a light source to cast a shadow on the object of interest, and figure out its 3D shape by observing the distortion of the shadow
  • Don’t know which technique is used :
    • OpenCV (see also this site), Intel’s Open Computer Vision library may some day contain some 3D reconstruction capabilities
    • Voxelization, a .NET based framework, designed for helping in development of different volume reconstruction, 3D voxel visualization and color consistency algorithms in multi view dynamic scenes, dead project ?
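To give an idea of what the stereo techniques above rely on, here is a minimal sketch of the textbook relation between depth and disparity for two identical, parallel cameras: depth = focal length × baseline / disparity. This is the general principle as I understand it, not the code of any of the packages listed; the function name and the sample numbers are made up.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Distance (in meters) of a point seen by two parallel cameras.

    focal_length_px: focal length expressed in pixels
    baseline_m: distance between the two cameras, in meters
    disparity_px: horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Hypothetical setup: 800-pixel focal length, cameras 10 cm apart.
for disparity in (80, 40, 20):
    print(disparity, "px ->", depth_from_disparity(800, 0.1, disparity), "m")
```

The closer the point, the larger the shift between the two pictures, which is why known camera positions matter so much for these techniques.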

My personal conclusion :

I haven’t tested any of these packages. At the moment, there seems to be no easy-to-use free software package that would compare to commercial stuff such as Photomodeler or ImageModeler or research works such as Microsoft Photosynth. However these techniques and algorithms seem to be mature enough to become available as open source packages soon, especially given the emerging interest in 3D scanning for fabbers! The most promising free packages for scannerless 3D scanning for fabbers are probably Stereo and libmv.

What do you think ?

Comparator

Comparator is a small Plone product I recently hacked for my pleasure. It’s called comparator until it gets a nicer name, if ever. I distribute it here under the GNU General Public License. It allows users to select any existing content type (object class) and to calculate a personalized comparison of the instances of this class. For example, if you choose to compare “News Items”, then you select the news item properties you want to base your comparison upon (title, creation date, description, …). You give marks to each value of these properties (a somewhat tedious process at the moment, but there is much room for improvement there). Comparator then lets you give relative weights to these properties so that the given marks are processed and the compared instances are ranked globally.

It’s a kind of basic building block for a comparison framework, for building Plone applications that compare stuff (any kind of stuff that exists within your portal, including semantically aggregated stuff). Let’s say that your Plone portal is full of descriptions of beers (with many details about all kinds of beers). Then adding a comparator to your portal will let your users give weights to every beer property and rank all the beers according to their personal tastes.
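The principle can be sketched in a few lines of plain Python. This is not the actual Comparator code (which lives inside Plone and Archetypes); the function name, the data layout and the beers are made up, and the weighted average is just one plausible way of combining marks and weights.

```python
def rank(items, marks, weights):
    """Rank items by the weighted average of the marks of their properties.

    items:   {item name: {property: value}}
    marks:   {property: {value: mark given by the user}}
    weights: {property: relative weight given by the user}
    """
    total_weight = sum(weights.values())
    scores = {}
    for name, properties in items.items():
        weighted_sum = sum(
            weights[prop] * marks[prop].get(properties[prop], 0)
            for prop in weights
        )
        scores[name] = weighted_sum / total_weight
    # Best score first.
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

beers = {
    "Stout": {"color": "dark", "strength": "strong"},
    "Pils": {"color": "pale", "strength": "light"},
}
marks = {
    "color": {"dark": 8, "pale": 4},
    "strength": {"strong": 3, "light": 9},
}

# A user who cares mostly about color, then one who cares about strength.
print(rank(beers, marks, {"color": 2, "strength": 1}))
print(rank(beers, marks, {"color": 1, "strength": 3}))
```

Changing only the weights is enough to reverse the ranking, which is the whole point of letting each user set them.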

Comparator is based on Archetypes and was built from a UML diagram with ArchgenXML. Comparator fits well in my vision of semantic aggregation. I hope you can see how. Comments welcome!

Daisy vs. Plone, feature fighting

A Gouri-friend of mine recently pointed me to Daisy, a “CMS wiki/structured/XML/faceted” stuff he said. I answered him that it may be a nice product but not attractive enough for me to spend much time analyzing it at the moment. Nevertheless, as Gouri asked, let’s browse Daisy’s features and try to compare them with Plone equivalents (given that I never tried Daisy).

The Daisy project encompasses two major parts: a featureful document repository

Plone is based on an object-oriented repository (Zope’s ZODB) rather than a document oriented repository.

and a web-based, wiki-like frontend.

Plone has its own web-based frontend. Wiki features are provided by an additional product (Zwiki).

If you have different frontend needs than those covered by the standard Daisy frontend, you can still benefit hugely from building upon its repository part.

Plone’s frontend is easily customizable, either with your own CSS, by inheriting from existing ZPT skins or with a WYSIWYG skin module such as CPSSkin.

Daisy is a Java-based application

Plone is Python-based.

, and is based on the work of many valuable open source packages, without which Daisy would not have been possible. All third-party libraries or products we redistribute are unmodified (unforked) copies.

Same for Plone. Daisy seems to be based on Cocoon. Plone is based on Zope.

Some of the main features of the document repository are:
* Storage and retrieval of documents.

Documents are one of the numerous object classes available in Plone. The basic object in Plone is… an object that is not fully extensible by itself unless it was designed to be so. Plone content types are more user-oriented than generic documents (they implement specialized behaviours such as security rules, workflows, displays, …). They will become very extensible when the next version of the underlying “Archetypes” layer is released (it includes a through-the-web schema management feature that allows web users to extend any existing content type).

* Documents can consists of multiple content parts and fields, document types define what parts and fields a document should have.

Plone’s perspective is different because of its object orientation. Another Zope product called Silva is more similar to Daisy’s document orientation.

Fields can be of different data types (string, date, decimal, boolean, …) and can have a list of values to choose from.

Same for Archetypes based content types in Plone.

Parts can contain arbitrary binary data, but the document type can limit the allowed mime types. So a document (or more correctly a part of a document) could contain XML, an image, a PDF document, … Part upload and download is handled in a streaming manner, so the size of parts is only limitted by the available space on your filesystem (and for uploading, a configurable upload limit).

I imagine that Daisy allows the upload and download of documents having any structure, with no constraint. In Plone, you are constrained by the object model of your content types. As said above, this model can be extended at run time (schema management) but at the moment, the usual way is to define your model at design time and then comply with it at run time. At run time (even without schema management), you can still add custom metadata or upload additional attached files if your content type supports attached files.

* Versioning of the content parts and fields. Each version can have a state of ‘published’ or ‘draft’. The most recent version which has the state published is the ‘live’ version, ie the version that is displayed by default (depends on the behaviour of the frontend application of course).

The default behaviour of Plone does not include real versioning but document workflows. It means that a given piece of content can be in state ‘draft’ or ‘published’ and go from one state to another according to a pre-defined workflow (with security conditions, event triggering and so on). But a given object has only one version by default.
But there are additional Plone products that make Plone support versioning. These products are to be merged into a future Plone distribution because versioning has been a long-awaited feature. Note that, at the moment, you can have several versions of a document to support multi-language sites (one version per language).
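To make the difference concrete, here is a small sketch of the ‘live’ version rule as Daisy describes it: the most recent version in state ‘published’ wins. It is written in plain Python for illustration only; it is neither Daisy’s nor Plone’s code, and the function name and data layout are made up.

```python
def live_version(versions):
    """Return the number of the most recent 'published' version, or None.

    versions: list of (version number, state) pairs, oldest first.
    """
    for number, state in reversed(versions):
        if state == "published":
            return number
    return None

# A document whose latest edit is still a draft: version 2 stays live.
history = [(1, "published"), (2, "published"), (3, "draft")]
print(live_version(history))
```

In default Plone there is no such history to walk through: the single object simply carries its current workflow state.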

* Documents can be marked as ‘retired’, which makes them appear as deleted: they won’t show up unless explicitly requested. Documents can also be deleted permanently.

Plone’s workflow mechanism is much more advanced. A default workflow includes a similar retired state. But the admin can define new workflows and modify the default one, always referring to the user role. Plone’s security model is quite advanced and is the underlying layer of every Plone functionality.
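
To make the idea of a role-guarded workflow a bit more concrete, here is a toy sketch in plain Python. It is not Plone's workflow API: the state names, the role names and the `do_transition` helper are all assumptions of mine, chosen only to illustrate how a transition can be tied to user roles.

```python
# Toy model of a role-guarded workflow; NOT Plone's actual workflow API.
# TRANSITIONS and do_transition are hypothetical names used for illustration.

# (current state, transition name) -> (new state, roles allowed to trigger it)
TRANSITIONS = {
    ("draft", "publish"): ("published", {"Reviewer", "Manager"}),
    ("published", "retract"): ("draft", {"Owner", "Manager"}),
    ("published", "retire"): ("retired", {"Manager"}),
}


def do_transition(state, transition, user_roles):
    """Return the new state, or raise if the move is not allowed."""
    try:
        new_state, allowed_roles = TRANSITIONS[(state, transition)]
    except KeyError:
        raise ValueError(f"no transition {transition!r} from state {state!r}") from None
    # The transition is guarded: the user needs at least one allowed role.
    if not allowed_roles & set(user_roles):
        raise PermissionError(f"{transition!r} requires one of {sorted(allowed_roles)}")
    return new_state
```

An admin who wants a new workflow would, in this toy model, simply edit the `TRANSITIONS` table; in a real Plone site this is done through the workflow management screens.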

* The repository doesn’t care much what kind of data is stored in its parts, but if it is “HTML-as-well-formed-XML”, some additional features are provided:
o link-extraction is performed, which makes it possible to search for the referrers of a document.
o a summary (first 300 characters) is extracted to display in search results
o (these features could potentially be supported for other formats also)

There is no such thing in Plone. Maybe in Silva ? Plone’s reference engine allows you to define associations between objects. These associations are indexed by Plone’s search engine (“catalog”) and can be searched.

* all documents are stored in one “big bag”, there are no directories.

Physically, the ZODB repository can have many forms (RDBMS, …). The default ZODB repository is a single flat file that can get quite big : Data.fs

Each document is identified by a unique ID (an ever-increasing sequence number starting at 1), and has a name (which does not need to be unique).

Each object has an ID but it is not globally unique at the moment. It is unfortunately stored in a hierarchical structure (Zope’s tree). Some Zope/Plone developers wished for “Placeless content” to be implemented. But Daisy must still be superior to Plone in that field.

Hierarchical structure is provided by the frontend by the possibility to create hierarchical navigation trees.

Zope’s tree is the most important structure for objects in a Plone site. It is too important. You can still create navigation trees with shortcuts. But in fact, the usual solution for getting maximum flexibility in navigation trees is to use the “Topic” content type. Topics are folder-like objects that contain a dynamic list of links to objects matching the Topic’s pre-defined query. Topics are like persistent searches displayed as folders. As an example, a Topic may display the list of all the “Photo”-typed objects that are in “draft” state in a specific part (tree branch) of the site, etc.
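
The notion of a Topic as a stored query can be sketched in a few lines of plain Python. This is only a toy model under my own assumptions: `Item`, `Topic` and the list-based `catalog` are hypothetical names and have nothing to do with Plone's real catalog API.

```python
# Toy model of a Topic as a persistent query; NOT Plone's catalog API.
from dataclasses import dataclass


@dataclass
class Item:
    path: str          # position in the site tree, e.g. "/site/gallery/a"
    portal_type: str   # content type name, e.g. "Photo"
    state: str         # workflow state, e.g. "draft"


@dataclass
class Topic:
    portal_type: str
    state: str
    path_prefix: str

    def results(self, catalog):
        """Re-run the stored query each time the Topic is displayed."""
        return [
            item for item in catalog
            if item.portal_type == self.portal_type
            and item.state == self.state
            and item.path.startswith(self.path_prefix)
        ]


catalog = [
    Item("/site/gallery/sunset", "Photo", "draft"),
    Item("/site/gallery/beach", "Photo", "published"),
    Item("/site/news/launch", "News Item", "draft"),
    Item("/site/archive/old", "Photo", "draft"),
]

# All draft photos located in the "gallery" branch of the site.
draft_photos = Topic(portal_type="Photo", state="draft", path_prefix="/site/gallery/")
print([item.path for item in draft_photos.results(catalog)])
```

Because the query is re-evaluated on each display, a photo added to the gallery later shows up in the Topic without anyone editing the Topic itself.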

* Documents can be combined in so-called “collections”. Collections are sets of the documents. One document can belong to multiple collections, in other words, collections can overlap.

Topics too ? I regret that Plone does not easily offer a default way to display a whole set of objects in just one page. As an example, I would have liked to display a “book” of all the contents in my Plone site as if it were just one single object (so that I can print it…) But there are some additional Plone products (extensions) that support similar functionalities. I often use “Content Panels” to build a page by defining its global layout (columns and lines) and by filling it with “views” from existing Plone objects (especially Topics). Content Panels mixed with Topics allow a high flexibility in your site. But this flexibility has some limits too.

* possibility to take exclusive locks on documents for a limited or unlimited time. Checking for concurrent modifications (optimistic locking) happens automatically.

See versioning above.

* documents are automatically full-text indexed (Jakarta Lucene based). Currently supports plain text, XML, PDF (through PDFBox), MS-Word, Excel and Powerpoint (through Jakarta POI), and OpenOffice Writer.

Same for Plone except that Plone’s search engine is not Lucene and I don’t know if Plone can read OpenOffice Writer documents. Note that you will require additional modules depending on your platform in order to read Microsoft files.

* repository data is stored in a relational database. Our main development happens on MySQL/InnoDB, but the provisions are there to add support for new databases, for example PostgreSQL support is now included.

Everything is in the ZODB, which is stored as a single file by default. It can also be stored in a relational database (but this is usually useless). You can also transparently mix several repositories in the same Plone instance. Furthermore, instead of having Plone write directly to the ZODB’s file, you can configure Plone so that it goes through a ZEO client-server setup, so that several Plone instances can share a common database (load balancing). Even better, there is a commercial product, ZRS, that allows you to transparently replicate ZODBs so that several Plone instances set up with ZEO can use several redundant ZODBs (no single point of failure).

The part content is stored in normal files on the file system (to offload the database). The usage of these familiar, open technologies, combined with the fact that the daisywiki frontend stores plain HTML, makes that your valuable content is easily accessible with minimal “vendor” lock-in.

Everything’s in the ZODB. This can be seen as a lock-in. But it is not really because 1/ the product is open source and you can script a full export with Python with minimal effort, 2/ there are default WebDAV + FTP services that can be combined with Plone’s Marshall extension (soon to be included in Plone’s default distribution) that allows you to output your content from your Plone site. Even better, you can also upload your structured semantic content with Marshall plus additional hacks as I mentioned somewhere else.

* a high-level, sql-like query language provides flexible querying without knowing the details of the underlying SQL database schema. The query language also allows to combine full-text (Lucene) and metadata (SQL) searches. Search results are filtered to only contain documents the user is allowed to access (see also access control). The content of parts (if HTML-as-well-formed-XML) can also be selected as part of a query, which is useful to retrieve eg the content of an “abstract” part of a set of documents.

No such thing in Plone as far as I know. You may have to Pythonize my friend… Except that Plone’s tree gives a URL to every object so that you can access any part of the site. But not with a granularity similar to Daisy’s supposed one. See Silva for more document-orientation.

* Accesscontrol: instead of attaching an ACL to each individual document, there is a global ACL which allows to specify the access rules for sets of documents by selecting those documents based on expressions. This allows for example to define access control rules for all documents of a certain type, or for all documents in a certain collection.

Access control is based on Plone’s tree, with inheritance (similar to the Windows security model in some way). I suppose Plone’s access control is more sophisticated and maintainable than Daisy’s, but it would require more investigation to explain why.

* The full functionality of the repository is available via an HTTP+XML protocol, thus providing language and platform independent access. The documentation of the HTTP interface includes examples on how the repository can be updated using command-line tools like wget and curl.

Unfortunately, Plone is not ReST enough at the moment. But there is some hope the situation will change with Zope 3 (Zope’s next major release that is coming soon). Note that Zope (so Plone) supports HTTP+XML/RPC as a generic web service protocol. But this is nothing near real ReSTful web services…
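
To show what such an XML-RPC call looks like on the wire, here is a minimal sketch using Python's standard `xmlrpc.client` module. It only builds and decodes the request body, without contacting any server. The method name `title_or_id` is just an illustrative assumption about what the target object exposes.

```python
import xmlrpc.client

# Build (but do not send) the XML-RPC request body for a call to a
# method named "title_or_id" with no arguments. The method name is only
# an illustrative assumption about what the target object exposes.
payload = xmlrpc.client.dumps((), methodname="title_or_id")
print(payload)

# Decoding the payload gives back the parameters and the method name.
params, method = xmlrpc.client.loads(payload)
```

Against a running instance, the same call would go through `xmlrpc.client.ServerProxy(url)`, where `url` is the address of the object you want to talk to. Note that the request is a POSTed remote procedure call, not a plain GET or PUT on a resource, which is why this is not real ReST.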

* A high-level, easy to use Java API, available both as an “in-JVM” implementation for embedded scenarios or services running in the daisy server VM, as well as an implementation that communicates transparently using the HTTP+XML protocol.

Say Python and XML/RPC here.

* For various repository events, such as document creation and update, events are broadcast via JMS (currently we include OpenJMS). The content of the events is XML messages. Internally, this is used for updating the full-text index, notification-mail sending and clearing of remote caches. Logging all JMS events gives a full audit log of all updates that happened to the repository.

No such mechanism as far as I know. But Plone of course offers fully detailed audit logs of any of its events.

* Repository extensions can provide additional services, included are:
o a notification email sender (which also includes the management of the subscriptions), allowing subscribing to individual documents, collections of documents or all documents.

No such generic feature by default in Plone. You can add scripts to send notifications in any workflow transition. But you need to write one or two lines of Python. And the management of subscriptions is not implemented by default. But folder-like objects support RSS syndication so that you can aggregate Plone’s new objects in your favorite news aggregator.

o a navigation tree management component and a publisher component, which plays hand-in-hand with our frontend (see further on)

I’ll see further on… :)

* A JMX console allows some monitoring and maintenance operations, such as optimization or rebuilding of the fulltext index, monitoring memory usage, document cache size, or database connection pool status.

You have several places to look at for this monitoring within Zope/Plone (no centralized monitoring). An additional Plone product helps in centralizing maintenance operations. Still some ground for progress here.

The “Daisywiki” frontend
The frontend is called the “Daisywiki” because, just like wikis, it provides a mixed browsing/editing environment with a low entry barrier. However, it also differs hugely from the original wikis, in that it uses wysiwyg editing, has a powerful navigation component, and inherits all the features of the underlying daisy repository such as different document types and powerful querying.

Well, then we can just say the same for Plone and rename its skins the Plonewiki frontend… Supports Wysiwyg editing too, with customizable navigation tree, etc.

* wysiwyg HTML editing
o supports recent Internet Explorer and Mozilla/Firefox (gecko) browsers, with fallback to a textarea on other browsers. The editor is a customized version of HTMLArea (through plugins, not a fork).

Same for Plone (except it is not an extension of HTMLArea but of a similar product).

o We don’t allow for arbitrary HTML, but limit it to a small, structural subset of HTML, so that it’s future-safe, output medium independent, secure and easily transformable. It is possible to have special paragraph types such as ‘note’ or ‘warning’. The stored HTML is always well-formed XML, and nicely laid out. Thanks to a powerful (server-side) cleanup engine, the stored HTML is exactly the same whether edited with IE or Mozilla, which makes source-based diffs possible.

No such validity control within Plone. In fact, the structure of a Plone document is always valid because it is managed by Plone according to a specific object model. But a given object may contain an HTML part (a document’s body, for example) that may not be valid. If your documents are to have a recurrent inner structure, then you are invited to make this structure an extension of an object class so that it is no longer handled as a document structure. See what I mean ?

o insertion of images by browsing the repository or upload of new images (images are also stored as documents in the repository, so can also be versioned, have metadata, access control, etc)

Same with Plone except for versioning. Note that Plone’s Photo content type supports automatic server-side resizing of images.

o easy insertion of document links by searching for a document

Sometimes yes, sometimes no. It depends on the type of link you are creating.

o a heartbeat keeps the session alive while editing

I don’t know how it works here.

o an exclusive lock is automatically taken on the document, with an expire time of 15 minutes, and the lock is automatically refreshed by the heartbeat

I never tried the Plone extension for versioning so I can’t say. I know that you can use the WebDAV interface to edit a Plone object with your favorite word processor if you want. And I suppose this interface properly manages this kind of issue. But I never tried.

o editing screens are built dynamically for the document type of the document being edited.

Of course.

* Version overview page, from which the state of versions can be changed (between published and draft), and diffs can be requested.
* Nice version diffs, including highlighting of actual changes in changed lines (ignoring re-wrapping).

You can easily move any object in its associated workflow (from one state to another, through transitions). But no versioning. Note that you can use Plone’s wiki extension, and this extension supports diffs and some versioning features. But this is not available for every Plone content type.

* Support for includes, i.e. the inclusion of one document in the other (includes are handled recursively).

No.

* Support for embedding queries in pages.

You can use Topics (persistent queries). You can embed them in Content Panels.

* A hierarchical navigation tree manager. As many navigation trees as you want can be created.

One and only one navigation tree by default. But Topics can be nested. So you can have one main navigation tree plus one or more alternatives with Topics (but these alternatives are limited for several reasons).

Navigation trees are defined as XML and stored in the repository as documents, thus access control (for authoring them, read access is public), versioning etc. apply. One navigation tree can import another one. The nodes in the navigation tree can be listed explicitly, but also dynamically inserted using queries. When a navigation tree is generated, the nodes are filtered according to the access control rules for the requesting user. Navigation trees can be requested in “full” or “contextualized” form, the latter meaning that only the nodes going to a certain document are expanded. The navigation tree manager produces XML; the visual rendering is up to XSL stylesheets.

This is nice. Plone can not do that easily. But what Plone can do is still done with respect to its security model and access control, of course.

* A navigation tree editor widget allows easy editing of the navigation trees without knowledge of XML. The navigation tree editor works entirely client-side (Mozilla/Firefox and Internet Explorer), without annoying server-side roundtrips to move nodes around, and full undo support.

Yummy.

* Powerful document-publishing engine, supporting:
o processing of includes (works recursive, with detection of recursive includes)
o processing of embedded queries
o document type specific styling (XSLT-based), also works nicely combined with includes, i.e. each included document will be styled with its own stylesheet depending on its document type.

OK

* PDF publishing (using Apache FOP), with all the same features as the HTML publishing, thus also document type specific styling.

Plone’s document-like content types offer PDF views too.

* search pages:
o fulltext search
o searching using Daisy’s query language
o display of referers (“incoming links”)

Fulltext search is available. No query language for the user. Display of referrers is only available for content types that are either wiki pages or have been given the ability to include references from other objects.

* Multiple-site support, allows to have multiple perspectives on top of the same daisy repository. Each site can have a different navigation tree, and is associated with a default collection. Newly created documents are automatically added to this default collection, and searches are limited to this default collection (unless requested otherwise).

It might be possible with Plone but I am not sure when this would be useful.

* XSLT-based skinning, with reusable ‘common’ stylesheets (in most cases you’ll only need to adjust one ‘layout’ xslt, unless you want to customise heavily). Skins are configurable on a per-site basis.

Plone’s skins use the Zope Page Templates technology. This is a very nice and simple HTML templating technology. Plone’s skins make extensive use of CSS and in fact most of the layout and look-and-feel of a site is now in CSS objects. These skins are managed as objects, with inheritance, overriding of skins and other sophisticated mechanisms to configure them.

* User self-registration (with the possibility to configure which roles are assigned to users after self-registration) and password reminder.

Same is available from Plone.

* Comments can be added to documents.

Available too.

* Internationalization: the whole front-end is localizable through resource bundles.

Idem.

* Management pages for managing:
o the repository schema (the document types)
o the users
o the collections
o access control

Idem.

* The frontend currently doesn’t perform any caching, all pages are published dynamically, since this also depends on the access rights of the current user. For publishing of high-traffic, public (ie all public access as the same user), read-only sites, it is probably best to develop a custom publishing application.

Zope includes caching mechanisms that take care of access rights. For very high-traffic public sites, a Squid frontend is usually recommended.

* Built on top of Apache Cocoon (an XML-oriented web publishing and application framework), using Cocoon Forms, Apples (for stateful flow scenarios), and the repository client API.

By default, Zope uses its own embedded web server. But the usual setup for production-grade sites is to put an Apache reverse-proxy in front of it.

My conclusion : Daisy looks like a nice product when you have a very document-oriented project, with complex documents whose structures vary a lot from one document to another ; its equivalent in Zope’s world would be Silva. But Plone is much more appropriate for everyday CMS sites. Its object-orientation offers both great flexibility for the developer and more ease of use for the Joe-six-pack webmaster. Plone still lacks some important technical features for its future, namely ReSTful web service interfaces and a placeless content paradigm. Versioning is expected soon.

This article was written in one go, late at night, and was reviewed once thanks to Gouri. It may be wrong or badly lacking information on some points. So your comments are most welcome !

From OWL to Plone

I found a working path to transform an OWL ontology into a working Plone content-type. Here is my recipe :

  1. Choose any existing OWL ontology
  2. With Protege equipped with its OWL plugin, create a new project from your OWL file.
  3. Still within Protege, with the help of its UML plugin, convert your OWL-Protege project into a UML classes project. You get an XMI file.
  4. Load this XMI file into an UML project with Poseidon. Save this project under the .zuml Poseidon format.
  5. From Poseidon, export your classes to a new XMI file. It will be Plone-friendly.
  6. With a text editor, delete the accented characters that Poseidon might have added to your file (for example, the French version of Poseidon adds a badly accented “Modele sans titre” attribute into your XMI) because the next step won’t appreciate them.
  7. Running python Archgenxml.py -o YourProduct yourprojectfile.xmi turns your XMI file into a valid Plone product. This requires the latest stable version of Plone and Archetypes (see doc) plus the ArchgenXML head from the subversion repository.
  8. Launch your Plone instance and install YourProduct as a new product from your Plone control panel. Enjoy YourProduct !
  9. Optionally, populate it with an appropriate marshaller.

Now you are not far from using Plone as a semantic aggregator.
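
Step 6 of the recipe above can be scripted instead of being done by hand. Here is a minimal sketch, assuming the file can be read as UTF-8 (the real encoding of Poseidon's export may differ, so adjust the `encoding` argument). Note that it does not just delete accented characters: it replaces accented letters by their base letters and drops any other non-ASCII character. `strip_accents` and `clean_xmi` are names I made up for this example.

```python
import unicodedata


def strip_accents(text):
    """Replace accented letters with their unaccented base letters and
    drop any other non-ASCII character."""
    # NFKD splits "è" into "e" plus a combining accent, which is then
    # discarded by the ASCII encoding step.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def clean_xmi(src_path, dst_path, encoding="utf-8"):
    """Write an ASCII-only copy of the XMI file at src_path to dst_path."""
    # Undecodable bytes become U+FFFD, which strip_accents then drops.
    with open(src_path, encoding=encoding, errors="replace") as src:
        cleaned = strip_accents(src.read())
    with open(dst_path, "w", encoding="ascii") as dst:
        dst.write(cleaned)
```

For instance, `clean_xmi("yourprojectfile.xmi", "yourprojectfile-clean.xmi")` would produce a cleaned copy that can be fed to step 7 (the second file name is only an example).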

The CMS pseudo-stock market

The Drupal people produced insightful stock-market-like statistics about the popularity of open source CMS packages (via the precious Amphi-Gouri). But their analysis mixes content management systems (Drupal, Plone) with blog engines (WordPress) and bulletin boards (phpBB). Anyway, it shows that :

  • The popularity of most Free and Open Source CMS tools is in an upward trend.
  • Bulletin boards are the most popular category, maybe the most mature, and phpBB is the strong leader in this category
  • In the CMS category, Mambo, Xoops, Drupal and Plone are direct competitors ; Mambo is ahead in terms of popularity, Plone is behind its PHP competitors, which certainly benefit from the popularity of PHP compared to Python ; PHP-Nuke and PostNuke are quickly losing ground.
  • WordPress is the most dynamic open source blog engine in terms of growth of popularity ; its community is exploding

My conclusion :

  • if you want an open source bulletin board/community forum, then choose phpBB with no hesitation
  • if you want a real content management system and are not religiously opposed to Python, then choose Plone, else stick with PHP and go Mambo (or Xoops ?)
  • if you want an open source blog engine, then enjoy WordPress

I feel that producing this kind of statistical analysis about the dynamics of open source communities is extremely valuable for organizations and people considering several open source options (cf. the activity percentile indicated on SourceForge projects as an example). I would tend to say that the strength of an open source community, measured in terms of growth and size, is the single most important criterion to rely on when choosing an open source product.

Nowadays, the (real) stock market relies strongly on rating agencies. There must be room (and thus a business opportunity) for an open source rating agency that would produce strong evidence about the relative strength of project communities.

What do you think ?

Zemantic: a Zope Semantic Web Catalog

Zemantic is an RDF module for Zope (read its announcement). From what I read (not tested by me yet), it implements services similar to zope catalogs and enables universal management of references (such as the Archetypes reference engine but in a more sustainable way). It is based on RDFLib, similarly to ROPE.

I feel enthusiastic about this product since it sounds to me like a good future-proof solution for the management of metadata, references and structured data within content management systems and portals. Plus Zemantic would sit well in my vision of Plone as a semantic aggregator.

This is a test about WordPress

This is a test related to my post on WordPress support forum (see below). You can ignore this message or read the comments in order to follow the results of this test.

Ampersands escaped in URLs within comments

On my blog (http://sig.levillage.org/) equipped with WordPress 0.72, when someone posts a comment containing an HTML link with a URL containing an ampersand, this URL gets mangled… Some characters (like the &) seem to be systematically escaped. IMO, this is a bug (not a feature). The escaping function should not escape ampersands in a URL when the URL is the value of a tag attribute. What do you think ? Did I miss something ? Is there currently a workaround ?

— Sig
http://sig.levillage.org

Zope and Plone learning roadmap

It is sometimes said that the art of mastering Zope and Plone is difficult. It has also been said that learning Zope Zen involves a steep learning curve. I have also read many newbies (like me) asking for information about the first steps to go through in order to smoothly get into Zope development. “Don’t start learning Zope before you know Python !”, “No need to master TAL, TALES and METAL to build Plone user interfaces, you’d rather learn advanced CSS techniques”, or the like… So I wonder : what is the recommended roadmap for learning Zope and Plone ? How can the global learning curve be made smoother, or just a little bit more visible and manageable ? The diagram below is my guess at the ideal learning roadmap for a would-be master in Zope+Plone :
The ideal roadmap for learning Zope and Plone.

Technological innovation in the service of sustainable development (continued)

I had pointed here to this report on the place of technological innovation in companies’ sustainable development policies. I also take away the following specific points :

  • Environmental management is an argument in favour of corporate strategies that aim to develop service offerings around existing product offerings.
  • Corporate communication on sustainable development is either an innovative marketing activity, when it is proactive, or rather a lobbying activity, when the point is to defend some of the company’s economic interests.
  • For a company, sustainable development is not among the top motivations for innovating.
  • Company executives can hardly convince shareholders that a sustainable development strategy is profitable without public intervention that explicitly points in that direction.
  • The innovation strategies observed differ according to company size : “a large company will be able to define a long-term strategy, mobilize its R&D resources, improve its internal and external communication and engage in lobbying, whereas a small company will prefer to invest in more specialized innovations or market niches and to mobilize the creativity of its whole staff”.
  • A technology in the service of sustainable development must be both clean (not harming the environment) and frugal (consuming few resources).

rdflib and ROPE

I just blog this Bob DuCharme article so that I can remember where practical information about rdflib can be read.
By the way, I have tested the very pre-pre-release of Reinout’s ROPE (see ROPE = Rdflib + zOPE). And I could install it on a fresh Zope 2.7.0b3 install. It was quite easy to install. But, as Reinout said, it is still a very early release and, as you can see on the attached screenshot, there is much work to be done before it is really usable. This screenshot is just a hint for Reinout : if you add several namespaces into a rdfstore (shouldn’t you name it a “rope” ?), they end up displayed one next to another instead of being options of the same HTML select widget. Anyway, I am looking forward to further releases of Rope.

Social economy and SRI

The players of the social economy (cooperatives, mutual societies and associations) are said to be champions of Socially Responsible Investment (SRI). Another idea : the main weakness of the social economy is the difficulty of taking risks, because investors are not rewarded for the risk they take ; hence the need for venture capital firms specialized in social economy enterprises.

Technological innovation in the service of sustainable development

An innovative technology is not in itself favourable or unfavourable to sustainable development. The innovation process, on the other hand, can be more so. Most of the technologies adopted to strengthen a sustainable development approach within a company are so-called “additive” technologies : they are added to an existing process to limit its harmful effects on the environment or to reduce its consumption of resources (water, energy, …), for example. However, the companies that communicate the most about sustainable development favour promoting technologies integrated into the life cycle of their processes, i.e. technologies that come into play upstream, at the product design stage.

Economic model of the GPL

This thesis presents the cooperation dynamics behind the success of the open source model embodied by the GPL software distribution licence. As the thesis points out, the author of a work distributed under the GPL gives up, through this licence, any rent (return on investment) tied to the capital that the work represents. Intellectual property law, whose purpose is to increase intellectual innovation, is supposed to strike a balance between the economic motivation of authors and that of consumers. Too much protection for authors and they will enjoy high rents which, as a perverse effect, will make the next innovation harder or less economically motivating. Too little protection and authors will lose their economic motivation to innovate. The GPL, which operates within the framework of intellectual property law, implements cooperation rules that could make it possible to restore the balance needed for innovation to develop.