Objective Caml

Objective Caml (OCaml) is the principal implementation of the Caml programming language, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and their collaborators in 1996. The language, a member of the ML family, is an open source project directed and maintained primarily by INRIA.

OCaml is the successor of Caml Light, to which it added, among other things, an object-oriented programming layer. The acronym CAML comes from Categorical Abstract Machine Language, but recent versions of Caml have abandoned that abstract machine.

Principles

Caml is a functional language extended with features that allow imperative programming. Objective Caml extends the language further by supporting object-oriented programming and modular programming. For these reasons, OCaml is classed as a multi-paradigm language.

It integrates these concepts into a type system inherited from ML, characterized by static, strong, and inferred typing.

The type system makes complex data structures easy to manipulate: algebraic types, that is, hierarchical and potentially recursive types (lists, trees…), are easy to represent and easy to handle using pattern matching. This makes OCaml a language of choice in fields that require manipulating complex data structures, such as compilers.

Strong typing, together with the absence of explicit memory management (thanks to a garbage collector), makes OCaml a very safe language. It is also known for its performance, thanks to its native-code compiler.
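As an illustration of algebraic types and pattern matching, here is a minimal sketch. The type expr and the function eval are hypothetical names chosen for this example; they do not come from any library.

```ocaml
(* A hypothetical expression type: hierarchical and recursive. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

(* Pattern matching handles one constructor per case. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

(* Represents 1 + 2 * 3. *)
let example = Add (Num 1, Mul (Num 2, Num 3))
```

Evaluating eval example walks the tree recursively, which is the kind of processing a compiler performs on syntax trees.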

History

The Caml language was born from the meeting of the ML programming language, in which INRIA's Formel team had been interested since the early 1980s, and the categorical abstract machine (CAM) of Guy Cousineau, based on work by Pierre-Louis Curien in 1984. The first implementation, written by Ascander Suarez and then maintained by Pierre Weis and Michel Mauny, was published in 1987. The language gradually diverged from its parent ML because the INRIA team wanted to adapt the language to its own needs and to keep it evolving, which conflicted with the "stability" imposed on ML by the Standard ML standardization effort.

The limitations of the CAM led to a new implementation, developed by Xavier Leroy from 1990, under the name Caml Light. This implementation, whose latest version is still used in teaching today although INRIA no longer maintains it, relies on a bytecode interpreter written in C, which makes it highly portable. The memory management system, designed by Damien Doligez, also first appeared in Caml Light.

In 1995, Xavier Leroy published a version of Caml named Caml Special Light, which introduced a native-code compiler and a module system inspired by the modules of Standard ML.

Objective Caml, first published in 1996, brought to Caml an object system designed by Didier Rémy and Jérôme Vouillon. Some advanced features, such as polymorphic variants and labels (which allow function arguments to be named), were introduced in 2000 by Jacques Garrigue.

Objective Caml has been relatively stable since then (despite the absence of a specification, the reference document being the official manual maintained by INRIA). Many dialects of OCaml have appeared and continue to explore specific aspects of programming languages (concurrency, parallelism, lazy evaluation, XML integration…); see the section "Derived languages".

Main features

Functional language

OCaml has most of the features common to functional languages, in particular higher-order functions and closures, and good support for tail recursion.
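The three features just mentioned can be sketched in a few lines. All names below (compose, make_adder, sum_to) are hypothetical and chosen only for this example.

```ocaml
(* compose is a higher-order function: it takes and returns functions. *)
let compose f g = fun x -> f (g x)

(* make_adder returns a closure that captures n. *)
let make_adder n = fun x -> x + n

(* Adds 3, then doubles the result. *)
let add_then_double = compose (fun x -> 2 * x) (make_adder 3)

(* sum_to is tail recursive: the recursive call to loop is the last
   operation, so it runs in constant stack space. *)
let sum_to n =
  let rec loop acc i = if i > n then acc else loop (acc + i) (i + 1) in
  loop 0 1
```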

Typing

OCaml's static typing detects at compile time a large number of programming errors that could otherwise cause problems at run time. However, unlike most other languages, it is not necessary to specify the types of the variables one uses. Caml has a type inference algorithm that lets it determine the types of variables from the context in which they are used.

Thus, the declaration

    let division_entiere x y = x mod y

introduces a function, division_entiere, with two arguments, here x and y, which must both be integers, and whose result is always an integer. OCaml expresses this by automatically answering:

    val division_entiere : int -> int -> int = <fun>

The ML type system supports parametric polymorphism, that is, types of which some parts are left unspecified when the value is defined. This feature, which is automatic, gives results comparable to generics in Java or C# or templates in C++.

Thus, the declaration

    let identity x = x

introduces a function, identity, with one argument, here x. This argument can have any type, and the result has the same type as the argument. OCaml expresses this by automatically answering:

    val identity : 'a -> 'a = <fun>

However, the extensions to ML typing required by advanced features such as object-oriented programming complicate the type system in some cases: using these features may then require a learning period for programmers who are not necessarily familiar with sophisticated type systems.
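A short sketch of inference and parametric polymorphism working together; pair_up and swap are hypothetical helper names, and the types in comments are those the compiler infers without any annotation.

```ocaml
(* No annotations: the types in the comments are inferred. *)
let identity x = x            (* 'a -> 'a *)
let pair_up x y = (x, y)      (* 'a -> 'b -> 'a * 'b *)
let swap (x, y) = (y, x)      (* 'a * 'b -> 'b * 'a *)

(* The same polymorphic function is used at several types. *)
let n = identity 42
let s = identity "caml"
let p = swap (pair_up 1 "one")
```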

Pattern matching

Pattern matching is an essential component of the Caml language. It makes code shorter thanks to a notation more flexible than traditional conditionals, and exhaustiveness is checked: the compiler offers a counterexample when an incomplete match is detected. For example, the following code triggers a compiler warning:

    type state = Active | Inactive | Unknown
    (* sum type: a state is one of the three values Active, Inactive, Unknown *)

    let est_actif = function
      | Active -> true
      | Inactive -> false

    Warning P: this pattern-matching is not exhaustive.
    Here is an example of a value that is not matched:
    Unknown
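For comparison, here is a sketch of two ways to make the match exhaustive. The decision to treat Unknown as not active is an assumption made for the example, and est_actif' is a hypothetical name.

```ocaml
type state = Active | Inactive | Unknown

(* Every constructor is covered, so no warning is emitted. *)
let est_actif = function
  | Active -> true
  | Inactive -> false
  | Unknown -> false

(* Alternative: a wildcard covers all remaining cases, at the cost of
   silencing the check if a constructor is added later. *)
let est_actif' = function
  | Active -> true
  | _ -> false
```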

Modules

Modules make it possible to break a program up into a hierarchy of structures containing logically related types and values (for example, all list-handling functions go in the List module). The descendants of the ML family are currently the languages with the most sophisticated module systems, which, in addition to providing namespaces, support abstraction (accessible values whose implementation is hidden) and composability (values that can be built on top of different modules, as long as they satisfy a given interface).

Thus, the three syntactic units are structures, interfaces, and modules. Structures contain the implementation of modules; interfaces describe the values that are accessible from them (values whose implementation is not exposed are abstract values, and those that do not appear at all in the interface of the module are inaccessible, like private methods in object-oriented programming). A module can have several interfaces (as long as they are all compatible with the types of the implementation), and several modules can satisfy the same interface. Functors are structures parameterized by other structures; for example, sets (module Set) in the OCaml standard library are implemented as a functor that takes as a parameter any structure implementing the interface made up of a type and a comparison function on values of that type.
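The following sketch shows a structure, an interface that hides an implementation, and an application of the standard Set.Make functor. IntOrder, COUNTER and Counter are hypothetical names invented for the example; only Set.Make comes from the standard library.

```ocaml
(* A structure providing a type and a comparison function. *)
module IntOrder = struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end

(* Set.Make is the standard library functor mentioned above. *)
module IntSet = Set.Make (IntOrder)

(* A hypothetical interface: the type t is abstract. *)
module type COUNTER = sig
  type t
  val zero : t
  val incr : t -> t
  val to_int : t -> int
end

(* The implementation uses int, but clients cannot rely on that. *)
module Counter : COUNTER = struct
  type t = int
  let zero = 0
  let incr c = c + 1
  let to_int c = c
end

let s = IntSet.add 3 (IntSet.add 1 (IntSet.add 3 IntSet.empty))
let two = Counter.(to_int (incr (incr zero)))
```

Because Counter.t is abstract, an expression such as Counter.zero + 1 is rejected by the type checker, even though the underlying representation is an integer.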

Object-oriented programming

OCaml is particularly notable for extending ML typing to an object system comparable to those of traditional object-oriented languages. This allows structural subtyping, in which objects have compatible types if the types of their methods are compatible, independently of their respective inheritance trees. This feature, entirely new in statically typed languages (it can be seen as an equivalent of the duck typing of dynamic languages), allows a natural integration of object concepts into a largely functional language.
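A minimal sketch of structural subtyping: the classes duck and robot are hypothetical, share no ancestor, and are nevertheless both accepted wherever an object with a speak method is expected.

```ocaml
class duck = object
  method speak = "Quack"
end

class robot = object
  method speak = "Beep"
  method charge = 100
end

(* Accepts any object with a speak method returning a string;
   the inferred type is < speak : string; .. > -> string. *)
let greet o = "It says: " ^ o#speak

let a = greet (new duck)
let b = greet (new robot)

(* An explicit coercion to a common supertype lets both objects
   live in the same list. *)
let all : < speak : string > list =
  [ (new duck :> < speak : string >); (new robot :> < speak : string >) ]
```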

Distribution

The OCaml distribution contains:
  • an interactive interpreter (ocaml)
  • a bytecode compiler (ocamlc) and the bytecode interpreter (ocamlrun)
  • a native compiler (ocamlopt)
  • generators of lexical analyzers (ocamllex) and parsers (ocamlyacc)
  • a preprocessor (camlp4), which allows extensions or modifications of the syntax of the language
  • a step-by-step debugger, with backward execution (ocamldebug)
  • profiling tools
  • a documentation generator (ocamldoc)
  • an automatic build manager (ocamlbuild), since OCaml 3.10
  • a varied standard library

The OCaml tools are regularly used on Windows, Linux, and Mac OS. The bytecode compiler produces files that are then interpreted by ocamlrun. Since bytecode is platform independent, this ensures great portability (ocamlrun can in principle be compiled on any platform with a working C compiler). The native compiler produces platform-specific assembly code, trading portability for much better performance. A native compiler is available for the IA32, PowerPC, AMD64, Alpha, Sparc, Mips, IA64, HPPA and StrongArm platforms.

A foreign function interface makes it possible to link OCaml code to primitives written in C, and the format of floating-point arrays is compatible with C and Fortran. OCaml code can also be embedded in a C program, which makes it possible to distribute OCaml libraries to C programmers without them needing to know or even install OCaml.

The OCaml tools are mostly written in OCaml, except for some libraries and the bytecode interpreter, which are written in C. In particular, the native compiler is written entirely in OCaml.

Memory management

Like Java, OCaml has automatic memory management, thanks to an incremental generational garbage collector. The collector is specially adapted to a functional language (optimized for a high rate of allocation and release of small objects) and therefore has no significant impact on program performance. It is configurable so that it remains efficient in atypical memory-usage situations.
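As a sketch of this configurability, the standard Gc module exposes the collector's parameters as a record. The value chosen below for the minor heap is arbitrary and serves only as an illustration; suitable settings depend on the program.

```ocaml
(* Read the current settings of the garbage collector. *)
let before = Gc.get ()

(* Enlarge the minor heap (the size is in words); the figure of
   about one million words is an arbitrary value for illustration. *)
let () = Gc.set { before with Gc.minor_heap_size = 1_048_576 }

(* Read the settings back after the change. *)
let after = Gc.get ()

(* Force a full major collection, for example before a measurement. *)
let () = Gc.full_major ()
```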

Performance

OCaml stands out from most languages developed in academic circles by its excellent performance. In addition to the "classic" local optimizations carried out by the native code generator, performance benefits from the functional and strongly typed nature of the language.

Thus, typing information is fully determined at compile time and does not need to be reproduced in the native code, which makes it possible, among other things, to remove type checks at run time entirely. In addition, some algorithms of the standard library exploit the useful properties of purely functional data structures: for example, the set union algorithm is asymptotically faster than that of imperative languages, because it uses the immutability of sets to reuse part of the input sets when building the output set (this is the path copying technique for persistent data structures).

Historically, functional languages were considered inherently slow by programmers, but progress in compilation techniques has made it possible to catch up with the initial advantage of imperative languages. OCaml, by efficiently optimizing these parts of the language (boxing, closures…) and by implementing a GC suited to the frequent allocations of functional languages, was one of the first languages to demonstrate the regained efficiency of functional programming.

In practice, performance is generally slightly lower than that of equivalent C code. Xavier Leroy speaks cautiously of "performance of at least 50% of that of a reasonable C compiler". These predictions have since been confirmed by many benchmarks. In practice, programs generally stay within this range (1 to 2 times the running time of the C code), with extremes in both directions (sometimes faster than C, sometimes severely slowed by an unfortunate interaction with the GC). In all cases, this remains faster than most recent languages that are not compiled to native code, such as Python, Ruby, or even the languages of the .NET platform.
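The persistence of sets can be observed directly. In this sketch, IntSet is a hypothetical module name obtained by applying the standard Set.Make functor; the operations return new sets and leave their arguments unchanged, which is what allows the implementation to share structure between them.

```ocaml
module IntSet = Set.Make (struct
  type t = int
  let compare (a : int) b = compare a b
end)

let a = IntSet.of_list [1; 2; 3]
let b = IntSet.of_list [3; 4]

(* union builds a new set; a and b are left untouched. *)
let u = IntSet.union a b

(* Adding an element also returns a new set, which may share
   subtrees with the old one instead of copying it. *)
let a' = IntSet.add 10 a
```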

Use

The OCaml language, which comes from research circles, does not benefit from the marketing power of some current programming languages. It therefore remains relatively little known to the computing "general public" (like most functional languages), but is nonetheless firmly established in a few niches where the qualities of the language outweigh its relative lack of publicity and support.

Teaching

The multi-paradigm nature of OCaml and its light syntax make it a rich language appreciated by teachers, who see it as a way to introduce their students to various aspects of programming within a unified framework. In particular, Caml (OCaml or its little brother Caml Light) is the language used by most French preparatory classes, with a view to the computer science tests of the competitive entrance examinations to the grandes écoles. It is also used in many universities, in France (for historical reasons) but also in the rest of the world, for example in the United States or Japan. In academia it faces competition from its distant cousin Haskell, preferred in some functional programming courses because it borrows no concepts from imperative programming.

Research

OCaml is fairly widely used in research. Historically, the languages of the ML branch have always been closely tied to the field of formal proof systems (Robin Milner's original ML was created for use in the LCF proof system). OCaml is the language used by one of the major pieces of software in the field, the Coq proof assistant.

OCaml is of course present in many other areas of computer science research, including research on programming languages and compilers (see the section "Derived languages"), and in the file synchronization software Unison.

Industry

Despite its relatively low-key promotion, OCaml has acquired a solid user base in specific areas of industry. Thus, the aeronautics industry uses OCaml for its programming safety and its effectiveness in expressing complex algorithms. One example in this field is the Astrée project (static analysis of real-time embedded software), used among others by Airbus.

OCaml is used by major players in the software industry, such as Microsoft, Intel, or XenSource, all three members of the Caml Consortium. It also has applications in financial computing, as shown by the company Jane Street Capital, which employs many OCaml programmers.

Finally, it is also used by general-purpose free software projects, such as MLDonkey, GeneWeb, the FFTW library, or even some software of the KDE desktop environment.

Presentation of the language

Basic declarations and values

Code can be entered simply after the "#" prompt that appears at the beginning of the line. For example, to define a variable x containing the result of the computation 1 + 2 * 3, one writes:

    # let x = 1 + 2 * 3;;

After this expression is entered and validated, Caml determines the type of the expression (here, an integer) and displays the result of the computation:

    val x : int = 7

One may be tempted to carry out all kinds of computations. However, care must be taken not to mix integers and reals, as is commonly done in many languages, because this causes a compile-time error:

    # 2.3 + 1;;
    Characters 0-3:
      2.3 + 1;;
      ^^^
    This expression has type float but is here used with type int

This simple example gives a first idea of how the type inference algorithm works. When we wrote "2.3 + 1", we added the real 2.3 and the integer 1, which is a problem. To carry out this computation, we must make sure that all the numbers have the same type, and use the addition operator for reals, written "+." in Caml. We should thus have written:

    # 2.3 +. 1.0;;
    - : float = 3.3
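When an integer and a real genuinely have to be combined, the conversion is written explicitly. A minimal sketch using the standard functions float_of_int and int_of_float (the variable names are arbitrary):

```ocaml
(* Integer and float arithmetic use distinct operators. *)
let i = 1 + 2 * 3             (* int *)
let f = 2.5 +. 1.0            (* float *)

(* Mixing the two requires an explicit conversion. *)
let g = 2.5 +. float_of_int 1
let k = int_of_float 3.7 + 1  (* int_of_float truncates toward zero *)
```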

Functions

Programs are often structured into procedures and functions. Procedures consist of a set of commands used several times in the program and grouped for convenience under one name. A procedure does not return a value, that role being reserved for functions. Many languages have distinct keywords to introduce a new procedure or a new function ("procedure" and "function" in Pascal, "Sub" and "Function" in Visual Basic, etc.). Caml, for its part, has only functions, and they are defined in the same way as variables. For example, to define the identity, one can write:

    # let id = function x -> x;;

After the expression is entered and validated, the type inference algorithm determines the type of the function. However, in the example given, nothing determines the type of x, so the function appears as polymorphic (to any element of the set 'a, it associates an image id x that is an element of the set 'a):

    val id : 'a -> 'a = <fun>
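Since functions are ordinary values, several notations define the same function, and functions of several arguments can be partially applied. The names id1, id2, id3, add and increment in this sketch are hypothetical.

```ocaml
(* Three equivalent ways to define the same function. *)
let id1 = function x -> x
let id2 = fun x -> x
let id3 x = x

(* Functions of several arguments are curried ... *)
let add x y = x + y

(* ... so they can be partially applied to get a new function. *)
let increment = add 1
```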

Recursion

Recursion consists in writing a function that refers to itself, on the model of mathematical induction. In Caml, recursive functions are introduced using the keyword rec. For example, to define the factorial, we can write:

    let rec factorial n = if n = 0 then 1 else n * factorial (n - 1);;

Internal definitions

It is possible to define variables or functions inside a function, using the following syntax:

    # let factorial n =
        let rec auxiliary result = function
          | 0 -> result
          | n -> auxiliary (n * result) (n - 1)
        in auxiliary 1 n;;

This is an example of tail recursion. The program is similar to a loop, and is treated as such by the compiler (which will produce, for example, the assembly code of a loop, thus matching the performance of the corresponding imperative code).
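To make the comparison with a loop concrete, here is the tail-recursive factorial next to an imperative counterpart. The name factorial_loop is hypothetical; both versions assume a non-negative argument.

```ocaml
(* Tail-recursive version with an accumulator. *)
let factorial n =
  let rec auxiliary result = function
    | 0 -> result
    | n -> auxiliary (n * result) (n - 1)
  in
  auxiliary 1 n

(* Imperative counterpart using a reference and a for loop. *)
let factorial_loop n =
  let result = ref 1 in
  for i = 1 to n do
    result := !result * i
  done;
  !result
```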

List manipulation

Lists are used very often in programming, in particular for recursive processing. To build a list, several notations are possible:

    # 1 :: 2 :: 3 :: [];;
    # [1; 2; 3];;

This yields a list of integers, which Caml displays as follows:

    - : int list = [1; 2; 3]

To find the length of a list without using the function of the List module defined for this purpose, one can write:

    (* Length of a list *)
    # let rec length = function
        | [] -> 0
        | t :: q -> 1 + length q;;

When this function is analyzed by the type inference algorithm, it appears that the list can contain any type of data, so the function has the following type:

    val length : 'a list -> int = <fun>
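The same style extends to other list functions. This sketch restates length, gives a tail-recursive variant with an accumulator, and shows a pattern that destructures two elements at once; length_acc and pairs are hypothetical names.

```ocaml
(* Naive length: not tail recursive. *)
let rec length = function
  | [] -> 0
  | _ :: q -> 1 + length q

(* Tail-recursive variant with an accumulator. *)
let length_acc l =
  let rec go n = function
    | [] -> n
    | _ :: q -> go (n + 1) q
  in
  go 0 l

(* Patterns can destructure several elements at once:
   pairs returns the list of consecutive pairs. *)
let rec pairs = function
  | x :: (y :: _ as rest) -> (x, y) :: pairs rest
  | _ -> []
```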

Higher-order functions

Higher-order functions are functions that take one or more functions as arguments and/or return a function. Most functional languages have higher-order functions. In Caml, examples can be found among the predefined functions of the modules Array, List, etc. For example, the following expression:

    # List.map (fun i -> i * i) [1; 2; 3; 4; 5];;

produces the following result:

    - : int list = [1; 4; 9; 16; 25]

The function map takes as its argument the anonymous function that maps an integer i to its square, and applies it to the elements of the list, thus building the list of squared values.
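Other functions of the List module follow the same pattern. A brief sketch using List.filter and List.fold_left, together with a hypothetical user-defined function twice that returns a function:

```ocaml
let squares = List.map (fun i -> i * i) [1; 2; 3; 4; 5]

(* Keep only the even squares. *)
let evens = List.filter (fun i -> i mod 2 = 0) squares

(* Sum the squares by folding from the left, starting at 0. *)
let total = List.fold_left (fun acc i -> acc + i) 0 squares

(* A user-defined higher-order function that returns a function. *)
let twice f = fun x -> f (f x)
let add_four = twice (fun x -> x + 2)
```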

Trees and recursive types

To define a binary tree of arbitrary type, one uses a recursive type. One can thus write:

    type 'a tree = Leaf | Branch of ('a tree * 'a * 'a tree);;

This tree is made of branches that fork at will and end in leaves. To find the height of a tree, one then uses:

    let rec height = function
      | Leaf -> 0
      | Branch (branch, _, branch') -> 1 + max (height branch) (height branch');;
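As a usage sketch, the tree type can serve as a binary search tree. The functions insert and to_list are hypothetical helpers written for this example; the constructor names Leaf and Branch are English renderings chosen here.

```ocaml
type 'a tree = Leaf | Branch of ('a tree * 'a * 'a tree)

let rec height = function
  | Leaf -> 0
  | Branch (l, _, r) -> 1 + max (height l) (height r)

(* Hypothetical helper: insertion into a binary search tree.
   Duplicates are ignored. *)
let rec insert x = function
  | Leaf -> Branch (Leaf, x, Leaf)
  | Branch (l, v, r) as t ->
    if x < v then Branch (insert x l, v, r)
    else if x > v then Branch (l, v, insert x r)
    else t

(* In-order traversal yields the elements in increasing order. *)
let rec to_list = function
  | Leaf -> []
  | Branch (l, v, r) -> to_list l @ (v :: to_list r)

let t = List.fold_left (fun acc x -> insert x acc) Leaf [3; 1; 4; 1; 5]
```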

Root finding by bisection

    let rec dicho f min max eps =
      let fmin = f min and fmax = f max in
      if fmin *. fmax > 0. then failwith "No root"
      else if max -. min < eps then (min, max) (* return an interval *)
      else
        let mil = (min +. max) /. 2. in
        (* <= rather than <, so that a midpoint that is exactly a root is kept *)
        if (f mil) *. fmin <= 0. then dicho f min mil eps
        else dicho f mil max eps;;

    (* Approximation of the square root of 2 *)
    # dicho (fun x -> x *. x -. 2.) 0. 10. 0.000000001;;
    - : float * float = (1.4142135618, 1.41421356238)
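The failure case can be handled without letting the exception escape. This sketch restates dicho (with a non-strict comparison, so that a midpoint landing exactly on a root is kept) and adds a hypothetical wrapper named root that returns an option.

```ocaml
let rec dicho f min max eps =
  let fmin = f min and fmax = f max in
  if fmin *. fmax > 0. then failwith "No root"
  else if max -. min < eps then (min, max)
  else
    let mil = (min +. max) /. 2. in
    (* Non-strict test: a midpoint that is exactly a root is kept. *)
    if f mil *. fmin <= 0. then dicho f min mil eps
    else dicho f mil max eps

(* Hypothetical wrapper: midpoint of the final interval, or None
   when f has the same sign at both ends of the interval. *)
let root f a b eps =
  match dicho f a b eps with
  | lo, hi -> Some ((lo +. hi) /. 2.)
  | exception Failure _ -> None

let sqrt2 = root (fun x -> x *. x -. 2.) 0. 10. 1e-9
let none = root (fun x -> x *. x +. 1.) 0. 10. 1e-9
```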

Criticisms

OCaml suffers somewhat from its lack of popularity. The community is not extremely active, which can be a hindrance when looking, for example, for an OCaml binding to a library written in another language: one will find it less often than in other popular languages like Python, or it is likely to be poorly maintained (or not maintained at all). The community's low profile is not merely a matter of size, since the Haskell community is visibly much more prominent even though the language is not, overall, used much more.

OCaml is often criticized for its lack of ad hoc polymorphism, in particular operator overloading: people dislike having to use two different operators, (+) and (+.), to add integers or floats.

Finally, compared with other widespread functional languages such as SML or Haskell, OCaml suffers from the absence of multiple implementations: the only complete implementation to date is the one maintained by the INRIA team, which is perhaps somewhat closed to external contributions (it seems, for example, relatively difficult to propose improvements to the standard library, as the ExtLib project attempts to do). There is a recent project to compile OCaml to the Java virtual machine, OCaml-Java, which appears to support the whole language, but it is still at an early stage of development and is an extension of the official OCaml compiler rather than an entirely separate implementation.

Derived languages

Many languages extend OCaml to add somewhat more exotic features.
  • F# is a language for the .NET platform developed by Microsoft Research, based on OCaml (and partly compatible)
  • MetaOCaml adds a mechanism for quotation and run-time code generation, which brings metaprogramming features to OCaml
  • Fresh OCaml (based on AlphaCaml, another derivative of OCaml) makes it easier to manipulate symbolic names
  • JoCaml adds to OCaml support for the join calculus, aimed at concurrent or distributed programs
  • OcamlP3L brings a particular form of parallelism, based on "skeletons" (skeleton programming)
  • GCaml adds ad hoc polymorphism to OCaml, allowing operator overloading or marshalling that preserves typing information
  • OCamlDuce allows the type system to represent XML values or values related to regular expressions. It is an intermediary between OCaml and the CDuce language, which specializes in XML manipulation

See also

External links

  • official OCaml site at INRIA
  • About Objective Caml
  • a history of Caml
  • the Hump (The Caml Hump), a directory of projects related to OCaml (libraries, bindings, software using OCaml, derived languages…)
  • CocanWiki, a wiki dedicated to the OCaml language and its industrial applications
  • Développement d'applications avec Objective Caml, an O'Reilly book available online (published in 2002 but still very thorough)
  • a Caml programming course for beginners
  • A brief history of Caml (as I remember it)
  • a French-language forum on Caml/OCaml
  • Developing a website in OCaml with the Ocsigen framework

References
