Desgrange.net


Monday, June 22 2009

Open Instant Messaging

Email protocols history

A long time ago there were several protocols for sending a message from one computer to another. There was one protocol per network (the internet was not really born at that time), and those protocols were not compatible with each other. For instance, if you were using FidoNet, you could not send emails to people using BITNET. Fortunately, some people created gateways to transfer emails from one network to another (though it looks like it was quite a nightmare). At some point, ARPANET and its email protocol became the standard and the other protocols started to vanish. People were now able to communicate with each other easily.

Instant messaging

Instant messaging (IM) appeared much more recently (email started in the late 60s, IM in the late 80s). Personal IM became very well known in 1996 with ICQ (I still have my ICQ account, but nobody to talk to on it anymore). Then several other protocols appeared.

Of course you can't chat with somebody using MSN Messenger if you use Yahoo! Messenger. ICQ was bought by AOL, and AOL created AIM (AOL Instant Messenger) based on the ICQ protocol. At some point people using ICQ were able to talk with people using AIM.

In 2004, XMPP, the protocol used by Jabber, became the official IETF instant messaging standard (the IETF is the group that defines internet standards such as HTTP for the web, SMTP for email and FTP for file transfer).

Jabber/XMPP

XMPP has been the standard for 5 years now, but still, most of my contacts are using MSN Messenger. Why? Because they use Microsoft Windows computers, and Microsoft's MSN Messenger is the default instant messaging software installed on them.

XMPP allows creating gateways to other protocols (like MSN, Yahoo!, ICQ…). The situation looks a bit like the state of email in the old days. But the email standard imposed itself as the killer application of ARPANET. So, what will be the thing that wipes out all the proprietary protocols and imposes the XMPP open standard?

The problem with standards is that they take time to develop. History has shown that new versions of a standard are not implemented quickly by all software vendors, nor are they deployed as fast as possible. So changes to the standard must not happen every day, and getting it right at version 1.0 takes time.

In the meantime, other protocols evolve faster because their vendors have a captive market and better control over how the software is distributed and used (and they don't have to wait for others to implement changes).

While XMPP was on its way to being standardized, other protocols gained voice and then video features. I had a lot of hope in 2005 when Google released its own IM software, Google Talk. GTalk is based on XMPP and adds a voice extension, with a video extension added later.

Then AOL started an experiment to allow XMPP connections to its network, and Yahoo announced that they were interested in XMPP too. The chat in Facebook uses XMPP (but the network is closed: you can only talk to Facebook users), and several other community websites do the same.

Nowadays

Last week, the specifications for voice/video in XMPP were released. The biggest missing feature keeping people on their proprietary IM is going to be history soon. But I'm not sure it will be enough to trigger a big migration to XMPP.

Over the last few years, interest in XMPP has increased a lot and nearly all IM vendors are now looking at XMPP… except Microsoft. It looks like you will soon have the choice between talking to nearly everybody except MSN Messenger users or talking only to MSN Messenger users.

Of course there is lots of software that lets you use several accounts at the same time (being connected to MSN, Yahoo! and Jabber simultaneously). I also have several email accounts. The difference is that from my professional email account I can send emails to everybody, and the same goes for my personal one. I choose the email account I use depending on what my "role" is. If I want to send a message to a colleague, I use my professional email address.

In IM, you can't do that, unless you open several accounts on each protocol you use: a professional and a personal account on MSN, on Yahoo!, on AIM, on Jabber… With all those protocols I currently have 9 accounts. And you know what? I like keeping things simple. For my email addresses I have started closing several accounts, keeping only the mandatory ones (my personal email address and the ones I have to use for my job). I would really like to do the same for IM, but here I can't do what I want. Why? Because if I close my MSN account, I will lose contact with a lot of people.

I feel a bit like in jail. Worse, I feel like my friends are in jail too but they are saying "Where do you see a jail? There's only walls and fences".

Anyway, lots of people are using GMail now, and there is a chat embedded in GMail. Of course this chat uses Google Talk, so it uses XMPP. Even if I don't really like GMail, I would rather have my friends use GMail/Google Talk than Hotmail/MSN Messenger (or whatever the name of those services is this week).

ejabberd

Since XMPP is an open protocol, anybody can implement it. There are several XMPP clients (Pidgin, Adium, Kopete, Trillian, iChat…) and there are also several servers.

Among XMPP servers there is a well-known one: ejabberd. This server is open source and written in Erlang. ejabberd uses the power of Erlang to be fault-tolerant, redundant, scalable, <add here any cool property a server should have>.

And since XMPP is a decentralized system, I can install my own server (as I did for my email server for instance).

Installing ejabberd on Debian is as easy as usual:

$ sudo apt-get install ejabberd

To configure it, you just need to change the domain name to serve in /etc/ejabberd/ejabberd.cfg. If your domain name is example.org change the following:

%% Hostname
{hosts, ["example.org"]}.

And set the admin user:

%% Admin user
{acl, admin, {user, "admin_user_name", "example.org"}}.

Add a user with the following command:

$ sudo ejabberdctl register user_name example.org password

Restart the server. Done.

Of course there are a lot more parameters to change if you want to fine-tune it. You may also need to create an SRV record in your DNS if the server is not the host serving "example.org" itself (but "im.example.org", for example).
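To check that the SRV records are visible from the outside, something like the following sketch could be run from an Erlang shell. The module and function names (srv_check, srv_names, lookup) are made up for this example; the two service labels are the ones XMPP uses for client and server-to-server connections, and inet_res:lookup/3 is the DNS resolver call shipped with Erlang/OTP.

```erlang
-module(srv_check).
-export([srv_names/1, lookup/1]).

%% Build the two SRV names used by XMPP for a domain:
%% one for client connections, one for server-to-server links.
srv_names(Domain) ->
    ["_xmpp-client._tcp." ++ Domain,
     "_xmpp-server._tcp." ++ Domain].

%% Query each name and return {Name, Records} pairs. Every record is a
%% {Priority, Weight, Port, Target} tuple; the list is empty when the
%% name has no SRV record or the query fails.
lookup(Domain) ->
    [{Name, inet_res:lookup(Name, in, srv)} || Name <- srv_names(Domain)].
```

Calling srv_check:lookup("example.org") needs network access, and an empty record list for both names suggests that the DNS entries are missing.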

Monday, February 9 2009

iPhone + Nimbuzz + Freephonie

A while ago, I wrote about Fring, an iPhone application for instant messaging and VoIP.

I had tried it for the VoIP side, since it can be used with my SIP account from Free. I was disappointed that I first had to create an account with Fring, and that all the traffic (my calls, that is) went through their servers.

What I'm looking for is a plain SIP client. I don't want any additional service behind it, and I want even less for my contact lists and phone calls to go through one more intermediary that I don't trust at all.

Today I tried Nimbuzz, which presents itself very much like Fring (so that does not bode well).

On first launch it asks you to create a Nimbuzz account:

login.png

The graphical interface is nicer than Fring's (personal taste):

contacts.png

I went to configure my SIP account… but nothing. Nimbuzz does not do SIP, at least not the iPhone version. It seems this feature is only supported in the Symbian version.

I configured one of my XMPP (Jabber) accounts and ran a quick tcpdump: all the data goes through Nimbuzz's servers. Interestingly (and obviously), the protocol used by the Nimbuzz application is XMPP. It is the instant messaging standard, originally developed for/by Jabber, and now used in many applications (Google Talk, Lotus Notes, Facebook Chat…).

Recently Nimbuzz was looking for Erlang developers, which is not surprising given that one of the most widely used XMPP servers (ejabberd) is written in Erlang.

In short, as far as I'm concerned, Nimbuzz is of no interest. No SIP, and my data goes through their servers when what I want is a "direct" connection.

Monday, January 5 2009

Exceptions in Erlang

Exception

In a programming language, an exception is something that can be raised when the system leaves the normal execution path. An exception is usually an error. In many programming languages, developers use exceptions as meaningful information to decide whether or not to do something.

For example, while reading a file, an exception can be raised because the file does not exist. The developer may choose to catch the exception and display a popup asking the user to choose another file. Or the developer may choose not to catch it: the file should have been there, its absence means something is wrong, and the developer has no clue what to do about it, so the best solution is to let the system crash (as opposed to trying to do something and maybe entering an inconsistent state).

Exceptions in Erlang

In Erlang there are exceptions too.

-module(exceptions).
-compile([export_all]).

run() ->
    io:fwrite("Test exception 1 starting...~n"),
    exception1(),
    io:fwrite("Test exception 1 finished.~n"),
    ok.

exception1() ->
  erlang:foo().

Running the function run calls exception1, which raises an exception (the function foo does not exist in the module erlang) that is not caught by run (so the second fwrite is not displayed):

$ erl -s exceptions run
Erlang (BEAM) emulator version 5.6.2 [source] [smp:2] [async-threads:0] [kernel-poll:false]

Test exception 1 starting...
{"init terminating in do_boot",{undef,[{erlang,foo,[]},{exceptions,run,0},{init,start_it,1},{init,start_em,1}]}}

Crash dump was written to: erl_crash.dump
init terminating in do_boot ()

Types of exceptions

In Erlang there are 3 kinds of exceptions that can be generated:

  • normal exceptions, user generated (throw(Reason))
  • errors, something is going really wrong, should not be caught (erlang:error(Reason))
  • exit, used to terminate the current process (exit(Reason))

Catching an exception

  • Catching an exception with catch:

run() ->
    io:fwrite("Test exception 1 starting...~n"),
    Result = (catch erlang:foo()),
    io:fwrite("Test exception 1 finished: ~p~n", [Result]),
    ok.

Calling run displays:

Erlang (BEAM) emulator version 5.6.2 [source] [smp:2] [async-threads:0] [kernel-poll:false]

Test exception 1 starting...
Test exception 1 finished: {'EXIT',
                               {undef,
                                   [{erlang,foo,[]},
                                    {exceptions,run,0},
                                    {init,start_it,1},
                                    {init,start_em,1}]}}

…and the process is still alive.

  • Catching an exception with try … catch. The full syntax is something like this:

    try erlang:foo() of
        Any ->
            Any
    catch
        error:Reason ->
            io:fwrite("Error reason: ~p~n", [Reason]);
        throw:Reason ->
            io:fwrite("Throw reason: ~p~n", [Reason]);
        exit:Reason ->
            io:fwrite("Exit reason: ~p~n", [Reason])
    after
        io:fwrite("Doing some stuff no matter what happened.~n")
    end.

try evaluates the given expression and returns its value (which can be pattern matched) if everything is OK; if an exception is raised, it goes into the matching catch clause. In any case, it then runs the after block (similar to finally in Java).

Running the previous code outputs:

Error reason: undef
Doing some stuff no matter what happened.

(Calling a function that does not exist raises an error.)

Like everything in Erlang, try … catch returns a value: the value of the evaluated expression, the value of the matching of clause, or the value of the matching catch clause.
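A minimal sketch of that last point, with a made-up module and function name (try_value, safe_div): the whole try expression is the body of the function, so whichever clause runs provides the return value.

```erlang
-module(try_value).
-export([safe_div/2]).

%% The try expression evaluates to the value of the clause that ran:
%% the "of" clause on success, the "catch" clause on a badarith error.
safe_div(A, B) ->
    try A div B of
        Quotient -> {ok, Quotient}
    catch
        error:badarith -> {error, division_by_zero}
    end.
```

Here safe_div(10, 2) evaluates to {ok,5} and safe_div(1, 0) to {error,division_by_zero}.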

Examples

Let's see more examples.

-module(exceptions).
-compile([export_all]).

run() ->
    run(1, no_exception, 'catch'),
    run(2, no_exception, 'try'),
    run(3, 'throw', 'catch'),
    run(4, 'throw', 'try'),
    run(5, 'exit', 'catch'),
    run(6, 'exit', 'try'),
    run(7, 'error', 'catch'),
    run(8, 'error', 'try'),
    ok.

run(ID, Exception_type, Handling_type) ->
    io:fwrite("~p) Generating ~p, handled with ~p.~n", [ID, Exception_type, Handling_type]),
    Fun = fun() -> exception(Exception_type) end,
    Result = execute(Handling_type, Fun),
    io:fwrite("~p) Result: ~p~n", [ID, Result]).

exception(no_exception) ->
    ok;
exception('throw') ->
    throw("Throwed exception");
exception('exit') ->
    exit("Exited");
exception('error') ->
    erlang:error("Error generated").

execute('catch', Fun) ->
    (catch Fun());
execute('try', Fun) ->
    try Fun()
    catch
	Error:Reason ->
	    {Error, Reason}
    end.

Output:

1) Generating no_exception, handled with 'catch'.
1) Result: ok
2) Generating no_exception, handled with 'try'.
2) Result: ok
3) Generating throw, handled with 'catch'.
3) Result: "Throwed exception"
4) Generating throw, handled with 'try'.
4) Result: {throw,"Throwed exception"}
5) Generating exit, handled with 'catch'.
5) Result: {'EXIT',"Exited"}
6) Generating exit, handled with 'try'.
6) Result: {exit,"Exited"}
7) Generating error, handled with 'catch'.
7) Result: {'EXIT',{"Error generated",
                    [{exceptions,exception,1},
                     {exceptions,execute,2},
                     {exceptions,run,3},
                     {exceptions,run,0},
                     {init,start_it,1},
                     {init,start_em,1}]}}
8) Generating error, handled with 'try'.
8) Result: {error,"Error generated"}

An interesting thing we can see here is that, in the case of an error, catch gets a stack trace, which can be very useful for debugging, but try … catch does not.

I prefer the try … catch syntax (and it's the recommended way to catch exceptions because you can choose what kind of exceptions you want to catch, whereas catch catches everything), but it's regrettable that it does not return the stack trace.

You can call erlang:get_stacktrace() inside the catch clause. It returns the stack trace of the last exception raised in the calling process, so call it first thing in the clause, before any other code has a chance to raise and catch another exception.

Having a stack trace is very useful but it makes things a bit slower. I made a simple benchmark:

bench() ->
    Throw_fun1 = fun(_) -> (catch exception('throw')) end,
    Error_fun1 = fun(_) -> (catch exception('error')) end,
    Throw_fun2 = fun(_) -> try exception('throw') catch Error:Reason -> {Error, Reason} end end,
    Error_fun2 = fun(_) -> try exception('error') catch Error:Reason -> {Error, Reason} end end,
    Seq = lists:seq(1, 100000),
    timer:sleep(1000),
    {Time_throw1, _} = timer:tc(lists, foreach, [Throw_fun1, Seq]),
    {Time_error1, _} = timer:tc(lists, foreach, [Error_fun1, Seq]),
    {Time_throw2, _} = timer:tc(lists, foreach, [Throw_fun2, Seq]),
    {Time_error2, _} = timer:tc(lists, foreach, [Error_fun2, Seq]),
    io:fwrite("Throw (catch): ~p micro seconds~n", [Time_throw1]),
    io:fwrite("Error (catch): ~p micro seconds~n", [Time_error1]),
    io:fwrite("Throw (try): ~p micro seconds~n", [Time_throw2]),
    io:fwrite("Error (try): ~p micro seconds~n", [Time_error2]),
    ok.

Results (for 100 000 calls):

Throw (catch): 73920 micro seconds
Error (catch): 169576 micro seconds
Throw (try): 64118 micro seconds
Error (try): 63125 micro seconds

Throwing an error or an exception takes about the same amount of time with try … catch, but using catch on an error is roughly 2.7 times slower than using try … catch (169576 vs 63125 microseconds). My quick conclusion is that this is because catch builds a stack trace.

I would like to have only one way of catching exceptions (no catch, only try … catch) and a way to specify whether I want a stack trace in case of error. Maybe something like this:

try Fun()
catch
    Error:Reason:Stack ->
        {Error, Reason, Stack}
end.

If a catch clause expects 3 elements (Error, Reason, Stack), the compiler adds whatever is necessary to call Fun with a stack trace. If there are only 2 elements (Error, Reason), it keeps the current behavior.
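For what it's worth, Erlang/OTP 21 and later accept a catch pattern with a third variable that is bound to the stack trace, which is very close to the idea above. The sketch below assumes such a release (it will not compile on the 5.6.2 emulator used in this post), and the module name stack_demo is made up.

```erlang
-module(stack_demo).
-export([run/0]).

%% Requires Erlang/OTP 21 or later: the third variable in the catch
%% pattern is bound to the stack trace of the caught exception.
run() ->
    try
        erlang:error("Error generated")
    catch
        Class:Reason:Stack ->
            {Class, Reason, Stack}
    end.
```

Calling stack_demo:run() returns a 3-tuple whose first two elements are error and "Error generated", and whose third element is a list describing the call stack.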

Do we really need exceptions in Erlang?

Exceptions may be useful but we can achieve the same goal in Erlang in other ways. Lots of functions have a signature like: {ok, Value} | {error, Reason}.

Those functions have different outputs between the normal case and the exceptional case. Combined with case we get the same behavior as catching an exception. If we don't use case, we get a badmatch error.

case_way() ->
    Fun = fun(ok) -> {ok, "It works"};
	     (nok) -> {error, "It does not work"}
	  end,
    run_case(1, Fun, ok),
    run_case(2, Fun, nok),
    ok.

run_case(ID, Fun, Arg) ->
    case Fun(Arg) of
	{error, Reason} ->
	    io:fwrite("~p) Error: ~p~n", [ID, Reason]);
	{ok, Value} ->
	    io:fwrite("~p) Value: ~p~n", [ID, Value])
    end.

Running case_way:

1) Value: "It works"
2) Error: "It does not work"
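The badmatch case mentioned earlier can be sketched as follows. The names (badmatch_demo, lookup, strict) are made up for this example; strict/1 matches directly on {ok, Value} instead of using case, so the error branch raises a badmatch error carrying the unmatched value.

```erlang
-module(badmatch_demo).
-export([run/0]).

lookup(ok)  -> {ok, "It works"};
lookup(nok) -> {error, "It does not work"}.

%% No case here: anything other than {ok, Value} raises a badmatch error.
strict(Arg) ->
    {ok, Value} = lookup(Arg),
    Value.

run() ->
    Good = strict(ok),
    %% The error reason is {badmatch, TheValueThatDidNotMatch}.
    Bad = try strict(nok) catch error:Reason -> Reason end,
    {Good, Bad}.
```

So the {error, Reason} tuple is not lost when it is ignored: it comes back inside the badmatch reason.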

Boons:

  • Much faster than exceptions (around 4 times faster according to my quick bench)
  • No specific syntax

Banes:

  • The normal case needs some encapsulation like {ok, Value} instead of returning Value directly (not doing so may lead to unknown states if the return value is not pattern matched for errors)
  • No convention (you can return {exit, Type, Reason} if you want, or anything else, so developers need to read the documentation of each function carefully before using it)

In Joe Armstrong's book, he says that people usually use {error, Reason} when an error occurs quite often and exceptions for less frequent errors. It makes sense, but I do not completely agree with Joe on that matter. When you are developing software for yourself (or your firm), you may know how your code is used, so you know whether an error occurs often or not. But when you are developing software for other developers, you don't know how they are going to use it, so it's more difficult to "predict" whether the error is going to occur often or not.

So I tend to prefer throwing exceptions (maybe I have used too much Java). I think the added syntax is necessary in order to keep things clear, homogeneous and easy to use. I would love to see a version of Erlang without functions having an {error, Reason} thing in their return signature (but something like throw(Reason) instead).

Complete source code associated with this post (you can do whatever you want with it): exceptions.erl

Monday, December 22 2008

Object Oriented Programming

Programming languages use different paradigms; nowadays the most used is the object-oriented paradigm. Object Oriented Programming was an attempt to simplify the creation of software as it became more and more complex.

The idea was to model the software with objects (a car for example) with some attributes (it has wheels, a motor, etc.) and some methods (start, move forward, etc.). Objects alone are not enough, they need to be able to communicate with each other by sending messages.

I have worked for several years with Java, a well known OOP language, and I'm now working with Erlang, a functional language.

A functional language has functions as first-class citizens. Functions should be stateless: they take some parameters and return a deterministic result. That does not seem like enough to design complex software.

Another feature of Erlang is the concurrent paradigm. It allows the language to do a lot of tasks in parallel. At first, most people think of this feature as a great tool to distribute computation and/or use all the cores of a modern CPU effortlessly. I thought the same thing. But this is not the aim of Erlang. Concurrent processes are used for software design. A process is a system, mostly representing a real-life system, sending messages to and receiving messages from other processes.

After using Erlang for a while (I'm not a very fast thinker, or maybe I just don't think too often ;-)), I realized that Erlang is the most Object Oriented Language I have ever seen.

It matches the definition of OOP more than any other language. Objects are processes (with an internal state) talking to each other by sending messages.

Why? Because Erlang allows you to have several thousand threads (objects) running simultaneously, where other languages allow you to have several thousand objects running in a few concurrent threads (so if you have X threads, only X objects are running simultaneously). In Erlang objects are really independent from each other, while in languages like C++, C# or Java, objects are executed sequentially.

Interestingly, although they share the same basic idea, those languages need a completely different approach. One of the big differences is that objects in Erlang talk to each other by passing messages, while in traditional OOP languages they do so through method calls.
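Here is a minimal sketch of an "object" written as a process, reusing the car image from earlier. The module and its functions (car, start, move, position, stop) are made up for this example: the loop argument plays the role of the private attributes, and each "method call" is a message.

```erlang
-module(car).
-export([start/0, move/2, position/1, stop/1]).

%% Create the "object": a process whose loop argument is its private state.
start() ->
    spawn(fun() -> loop(0) end).

%% Asynchronous "method": just send a message and carry on.
move(Car, Distance) ->
    Car ! {move, Distance},
    ok.

%% Synchronous "method": send a request, then wait for the reply.
position(Car) ->
    Car ! {position, self()},
    receive
        {Car, Position} -> Position
    after 1000 ->
        timeout
    end.

stop(Car) ->
    Car ! stop,
    ok.

%% The state only changes by looping with a new argument.
loop(Position) ->
    receive
        {move, Distance} ->
            loop(Position + Distance);
        {position, From} ->
            From ! {self(), Position},
            loop(Position);
        stop ->
            ok
    end.
```

After Car = car:start(), car:move(Car, 3) and car:move(Car, 4), the call car:position(Car) returns 7. Nothing outside the process can touch Position directly, which is about as encapsulated as an object can get.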

In those languages we do not deal with multiple threads very often, and we tend not to have threads talking to each other because that requires locks and synchronization. Erlang deals with that very easily.

So in an OOP language like Java, when a thread dies, a lot of objects die, maybe the whole program; that may not be so bad (a kind of fail-fast approach). In Erlang, when a process dies, it's just one object and everything else is still running (when a car crashes, the neighborhood does not disappear), but this can lead to some unknown state (imagine a firm where the boss dies and nobody notices (and everybody is glad not to receive orders anymore ;-))). Erlang provides some tools to manage that. Processes can be linked together: when a process dies, it sends a death note (sorry ;-)) to the processes it was linked to. When a process receives such a message and does not know how to handle it, it dies too (my wife is dead, I'm lost, I must commit suicide). But of course it may know what to do with the message, such as starting a new process to replace the dead one (let's find a new wife). Erlang's libraries include modules of this kind, called supervisors, which specialize in re-spawning a process when it dies. It may look like overhead, but I think it's very useful for creating robust and fault-tolerant applications.
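The link mechanism can be sketched by hand in a few lines. This is only an illustration with made-up names (keeper, keep, worker), not how the real supervisor module is written: a process that traps exits receives the "death note" as an ordinary {'EXIT', Pid, Reason} message and starts a replacement.

```erlang
-module(keeper).
-export([start/0]).

%% Runs the demo and returns {ReasonOfTheCrash, ReplacementIsAlive}.
start() ->
    Parent = self(),
    spawn(fun() -> keep(Parent) end),
    receive
        {restarted, Reason, Alive} -> {Reason, Alive}
    after 1000 ->
        timeout
    end.

keep(Parent) ->
    %% Trapping exits turns exit signals from linked processes into messages.
    process_flag(trap_exit, true),
    First = spawn_link(fun worker/0),
    First ! crash,
    receive
        {'EXIT', First, Reason} ->
            %% The worker died: start a replacement and report back.
            Second = spawn_link(fun worker/0),
            Parent ! {restarted, Reason, is_process_alive(Second)},
            exit(Second, kill)   % clean up the demo worker
    end.

worker() ->
    receive
        crash -> exit(boom)
    end.
```

Calling keeper:start() returns {boom,true}: the first worker died with reason boom and a second one was alive right after. Without the process_flag call, the keeper would simply have died along with its worker.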

Anyway, I have the feeling that in the Erlang community there is a strong opinion against OOP. I think I would have understood how to design software in Erlang faster if people had told me first that Erlang is true OOP, and why and how. Moreover, it might bring more people to Erlang to develop the language, and reduce the number of people asking for such and such an OOP feature that is already in Erlang, just not in the form they have in mind.

Ralph Johnson has an interesting (and short enough) article about Erlang being the next Java.

Same thing was discussed on PlanetErlang.

Monday, December 1 2008

Erlang hot swapping

I started using the Erlang programming language a few months ago. Erlang is a wonderful language if you want to do reliable, distributed, highly concurrent, fault-tolerant, soft-real-time, highly available, hot swapping applications :-).

So, today I will give you a glimpse of one cool feature of Erlang: the ability to update your code without stopping your application. Here it's just about how it works for a single process (the principle is the same for an application, except that you have many more things to take into account when you have several thousand processes talking to each other).

THE CODE

We are going to use a simple piece of code. We want a server thread listening for messages (each message is printed on the console with a sequence number), and a client thread sending a message to the server every second. Here is the code (I usually don't use comments, but I put some here because you may not be familiar with Erlang code):

-module(code_reload). % This is the module declaration.

-export([start_server/0, start_client/1]). % Exporting functions allows them to be called from outside the module.
-export([server_loop/1, client_loop/1]).

start_server() ->
    spawn(?MODULE, server_loop, [0]). % This function calls the server_loop function in a new thread.

server_loop(Count) ->
    receive % Wait for messages
        {From, quit} ->
            io:fwrite("Received quit command from ~p~n", [From]),
            ok;
        {From, Message} ->
            io:fwrite("~p: Server ~p received message ~p from ~p~n", [Count, self(), Message, From]), % Display the message we received.
            ?MODULE:server_loop(Count); % Call the same function again to wait for another message.
    	_ ->
            throw(unexpected_message)
    end.

start_client(ServerPid) ->
    spawn(?MODULE, client_loop, [ServerPid]).

client_loop(ServerPid) ->
    receive
        {From, quit} ->
            io:fwrite("Received quit command from ~p~n", [From]),
            ok
    after 1000 -> % If no messages were received after 1 second, send a message to the server.
            ServerPid ! {self(), now()},
            ?MODULE:client_loop(ServerPid)
    end.

Let's start an Erlang shell:

$ erl
Erlang (BEAM) emulator version 5.6.2 [source] [smp:2] [async-threads:0] [kernel-poll:false]

Eshell V5.6.2  (abort with ^G)
1> c(code_reload). % This compiles the module code_reload.
{ok,code_reload}
2> ServerPid = code_reload:start_server(). % Start the server and assign the server process id to the variable ServerPid.
<0.37.0>
3> ClientPid = code_reload:start_client(ServerPid).
<0.39.0>
0: Server <0.65.0> received message {1211,11183,231032} from <0.67.0>
0: Server <0.65.0> received message {1211,11184,232031} from <0.67.0>
0: Server <0.65.0> received message {1211,11185,233029} from <0.67.0>
0: Server <0.65.0> received message {1211,11186,234031} from <0.67.0>
0: Server <0.65.0> received message {1211,11187,235030} from <0.67.0>

We can see the server printing the messages. But oops, there is a bug! I forgot to increment the counter, so each line starts with "0" instead of an incrementing number.

Let's change the following line in the code:

?MODULE:server_loop(Count + 1);

We need to go back to the shell and compile the new code:

4> c(code_reload).
{ok,code_reload}
0: Server <0.65.0> received message {1211,11188,236023} from <0.67.0>
0: Server <0.65.0> received message {1211,11189,237031} from <0.67.0>
1: Server <0.65.0> received message {1211,11190,238021} from <0.67.0>
2: Server <0.65.0> received message {1211,11191,239031} from <0.67.0>
3: Server <0.65.0> received message {1211,11192,240018} from <0.67.0>
4: Server <0.65.0> received message {1211,11193,241022} from <0.67.0>
5: Server <0.65.0> received message {1211,11194,242021} from <0.67.0>
6: Server <0.65.0> received message {1211,11195,243031} from <0.67.0>

Ah! It's better, the message number is growing as expected. Notice that the threads are still the same (<0.65.0> for the server and <0.67.0> for the client).

There is no black magic here. The main part of the code is a loop. The server function is server_loop and it calls itself with ?MODULE:server_loop(…). The process is running inside the old version of the code, and when we call server_loop again, the new call is sent to the new version of the code.

Be careful: this happens only because we call the function as ?MODULE:server_loop (code_reload:server_loop also works). Hot upgrade does not work if we call server_loop directly, without going through the module name again. This lets you control where and when code reloading is done.
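To see the difference between the two kinds of call side by side, here is a small sketch. Everything in it (the module reload_demo, the two loops and version/0) is made up for this example; the only thing that differs between the loops is how they call themselves.

```erlang
-module(reload_demo).
-export([start/1, loop_local/0, loop_remote/0]).

%% Kind selects which of the two loops the new process runs.
start(Kind) when Kind =:= loop_local; Kind =:= loop_remote ->
    spawn(?MODULE, Kind, []).

%% Local call: the process stays in the version of the code it started in.
loop_local() ->
    receive
        {From, version} ->
            From ! {self(), version()},
            loop_local();
        stop ->
            ok
    end.

%% Fully qualified call: each iteration jumps to the newest loaded version.
loop_remote() ->
    receive
        {From, version} ->
            From ! {self(), version()},
            ?MODULE:loop_remote();
        stop ->
            ok
    end.

%% Change this to 2 and recompile with c(reload_demo) to see the effect.
version() -> 1.
```

Start one process of each kind, edit version/0 to return 2 and recompile from the shell. The loop_remote process answers 2 after at most one more reply, while the loop_local process keeps answering 1.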

Now we can send a quit message to the client and the server:

5> ClientPid ! {self(), quit}.
Received quit command from <0.30.0>
{<0.30.0>,quit}
6> ServerPid ! {self(), quit}.                     
Received quit command from <0.76.0>
{<0.76.0>,quit}