The aim of the Phalanger project is to create a module that enables PHP scripts to execute on the Microsoft .NET platform. The module cooperates with ASP.NET so that web pages written in PHP are generated in the same way as ASP.NET pages.
In contrast to the original PHP interpreter, our module compiles scripts into MSIL (Microsoft Intermediate Language). The first access to a page is delayed by compilation, but all subsequent accesses benefit from the fast execution of native code, which is generally far more efficient than script interpretation. This gain matters most for common PHP script libraries (for example PHPLIB, PEAR or Nuke), which are complex yet effectively immutable, so their script files are compiled only once. On the other hand, PHP scripts may contain code that is not known at compile time and has to be compiled at run-time. The .NET platform fully supports run-time code generation. Although this procedure adds some overhead, the PHP constructs that force run-time compilation are used rather rarely, and when they are used, the amount of code involved is usually small.
The principal goal of our project is to run existing PHP scripts with full functionality and without any modification. The only condition is that the scripts rely neither on special functionality provided by the UNIX platform or the Apache server, nor on undocumented, obsolete or faulty behavior of the PHP interpreter. To reach this goal, the built-in functions that the PHP interpreter provides to programmers (for example string and array manipulation) have to be reimplemented. They are rewritten in the managed .NET environment, namely in C#, so many of them will be reusable by other .NET application programmers.
In addition to the built-in ones, there are external functions which are dynamically linked to PHP. These provide additional functionality to PHP programmers, such as database access, image manipulation or data compression. Since there is a huge number of such libraries, and there will be even more in the future, it is impossible to reimplement them all in C#. For this reason our project contains a component that enables external functions to be called from the .NET platform, so they can be used not only from PHP scripts but also from other .NET languages such as C# or VB.NET.
In addition to the PHP compiler, our project contains a component integrating PHP script editing into the Visual Studio .NET environment, version 2003.
The Phalanger compiler compiles the PHP language into MSIL byte-code. It is thus the front-end of the compilation, while the back-end is provided by the JITter (Just-In-Time compiler), which is part of the .NET Framework. For this reason we address neither native code generation nor its optimization. Our task is to compile PHP scripts into .NET assemblies - logical units containing MSIL code and meta-data.
To be able to reuse compiled code as much as possible, it is necessary to compile individual PHP scripts separately. This contrasts with the behavior of the PHP interpreter: when a PHP script references another (for example using the include construct), the interpreter performs actions very similar to pasting the referenced source code at that place. This means that code sharing involves a lot of code reinterpretation unless the code is cached and optimized by an additional third-party tool. For this reason the Phalanger compiler compiles individual files into unilaterally independent modules (an included script does not depend on the including one), so that the resulting modules can be linked without recompilation. This approach requires careful treatment of the compiled form of scripts, which may or may not permit such optimization.
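The difference between copy-paste inclusion and linking of separately compiled units can be illustrated with a small sketch. It is not Phalanger code: Python source stands in for PHP scripts, Python code objects stand in for compiled modules, and the names ScriptLinker, unit and run are hypothetical. The point it tries to show is that an included script is compiled once and then reused by every script that includes it.

```python
from typing import Any, Dict


class ScriptLinker:
    """Compiles each script at most once and links includes to the compiled unit.

    Hypothetical illustration only; none of these names exist in Phalanger.
    """

    def __init__(self, sources: Dict[str, str]) -> None:
        self.sources = sources                    # script name -> source text
        self.units: Dict[str, Any] = {}           # script name -> compiled unit
        self.compile_count: Dict[str, int] = {}   # how often each script was compiled

    def unit(self, name: str) -> Any:
        # Compile on first use only. The compiled unit does not depend on
        # whichever script happens to include it.
        if name not in self.units:
            self.units[name] = compile(self.sources[name], name, "exec")
            self.compile_count[name] = self.compile_count.get(name, 0) + 1
        return self.units[name]

    def run(self, name: str, scope: Dict[str, Any]) -> Dict[str, Any]:
        # 'include' executes the already compiled unit in the includer's scope,
        # instead of re-reading and re-interpreting the source text.
        scope["include"] = lambda other: self.run(other, scope)
        exec(self.unit(name), scope)
        return scope


sources = {
    "lib": "def greet(who):\n    return 'Hello, ' + who\n",
    "page_a": "include('lib')\nresult = greet('A')\n",
    "page_b": "include('lib')\nresult = greet('B')\n",
}
linker = ScriptLinker(sources)
page_a = linker.run("page_a", {})
page_b = linker.run("page_b", {})
```

Both pages use the library, yet compile_count records a single compilation of "lib"; only the linking step is repeated.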
The PHP language contains some constructs of an interpreted character. These constructs have to be handled at compile time so that the resulting code behaves identically but runs efficiently. Some constructs can be implemented using parts of the compile-time symbol tables emitted into the resulting code; others require the execution of code that is unknown at compile time (e.g. the eval construct). The latter require a compiler to be invoked at run-time to generate and run the MSIL code (run-time code generation is supported by the Framework thanks to JIT compilation, although there are some drawbacks, which we also discuss in this paper). It is important to note that an experienced programmer seldom needs such constructs, because in most cases the same effect can be achieved with a "cleaner" technique. It is thus not necessary to give such constructs any special optimization effort.
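As an illustration of the first kind of construct, consider indirect variable access, where the variable name is itself computed at run-time (PHP's variable variables are assumed here as the example; the text above does not name a specific construct). The sketch below shows one way a compiler could keep fast slot-based access for ordinary variables while emitting its name-to-slot table into the code for the indirect case. The Frame class and its methods are hypothetical.

```python
from typing import Any, Dict, List


class Frame:
    """Local variables of one compiled function (hypothetical sketch)."""

    def __init__(self, names: List[str]) -> None:
        # Compile-time symbol table emitted into the generated code:
        # it maps each variable name known to the compiler to a slot index.
        self.slot_of: Dict[str, int] = {n: i for i, n in enumerate(names)}
        self.slots: List[Any] = [None] * len(names)
        # Variables whose names only appear at run-time.
        self.dynamic: Dict[str, Any] = {}

    def load_indirect(self, name: str) -> Any:
        # Indirect read: resolve the name through the emitted table first.
        if name in self.slot_of:
            return self.slots[self.slot_of[name]]
        return self.dynamic.get(name)

    def store_indirect(self, name: str, value: Any) -> None:
        # Indirect write: known names hit their slot, unknown names are
        # created dynamically.
        if name in self.slot_of:
            self.slots[self.slot_of[name]] = value
        else:
            self.dynamic[name] = value


frame = Frame(["a", "b"])
frame.slots[0] = 10                 # direct access compiled to a slot index
frame.store_indirect("b", 7)        # indirect write to a known variable
frame.store_indirect("c", 5)        # indirect write to a new variable
```

Direct accesses pay nothing for this mechanism; only the rarely used indirect accesses go through the name lookup.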
PHP contains several hundred functions available to PHP script programmers. They may be divided into the following groups:
Language constructs are implemented by the compiler, including those that have function-call syntax but treat their arguments in a special manner (such as array or list). There are no corresponding functions in the Phalanger Class Library.
Built-in functions are implemented in the Phalanger Class Library, which is written entirely in C# and provides this functionality via classes (for example PHP.PhpStrings and PHP.PhpArrays) and their static methods. The library is designed to offer its functionality not only to compiled PHP code but also to encourage reuse in any other .NET Framework application (written in C#, J#, VB.NET etc.). Moreover, by respecting some simple rules, a programmer can extend the Class Library with his or her own functions written in an arbitrary .NET language.
External functions are implemented in PHP as dynamically linked libraries (.dll files on the Microsoft Windows platform). These libraries are loaded into the PHP interpreter's address space and communicate with PHP through a predefined set of functions (called the Zend API). A typical PHP installation contains about 50 such libraries, and the functions they provide are numerous. It is possible to implement any one of these libraries in C# and add it to the Class Library (as stated above), but it is impossible to implement all of them, because the set of such libraries is not limited to those in the PHP distribution. For this reason we decided to use the existing .dll libraries and to create a component that allows the contained functions to be called directly from .NET, and thus from both C# and compiled PHP code.
There are two means of loading PHP extensions: collocated or isolated. The web server administrator may configure individual extensions depending on their reliability preferring either performance or safety:
The isolated dynamic libraries are loaded into the address space of the ExtManager, which is written in MC++ (C++ with Managed Extensions) and communicates with the ASP.NET server using .NET Remoting. The communication takes place via shared memory managed by our remoting channel called ShmChannel. This channel is a part of the Phalanger project but may be used independently in any other .NET application as a faster alternative to the TcpChannel and HttpChannel shipped with the .NET Framework. The ExtManager simulates the PHP interpreter environment for the hosted libraries so that their functionality is the same as in PHP. Since the extension libraries contain unmanaged (native) code, they may cause exceptions that terminate the hosting process. In such a case the ASP.NET server starts a new ExtManager process, and the server process itself is not threatened.
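The isolation idea can be sketched as follows. This is only an illustration of the process boundary and the restart behavior: it uses a child Python process with JSON messages over pipes in place of the ExtManager, .NET Remoting and the shared-memory ShmChannel, and all names (IsolatedHost, ExtensionCrashed, the "strlen" and "crash" requests) are hypothetical.

```python
import json
import subprocess
import sys

# Source of the child process that hosts the "extension". A "crash" request
# stands in for a fault in native code that kills the hosting process.
CHILD_SOURCE = r"""
import json, os, sys
for line in sys.stdin:
    name, args = json.loads(line)
    if name == "crash":
        os._exit(1)
    result = len(args[0]) if name == "strlen" else None
    sys.stdout.write(json.dumps(result) + "\n")
    sys.stdout.flush()
"""


class ExtensionCrashed(Exception):
    """Raised in the caller when the isolated host process died during a call."""


class IsolatedHost:
    def __init__(self) -> None:
        self.restarts = 0
        self._start()

    def _start(self) -> None:
        self.proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def call(self, name: str, *args):
        # Send one request and wait for one reply line.
        self.proc.stdin.write(json.dumps([name, list(args)]) + "\n")
        self.proc.stdin.flush()
        reply = self.proc.stdout.readline()
        if reply == "":
            # End of stream: the host died. Clean up, start a fresh host and
            # report the failure; the calling process itself keeps running.
            self.proc.stdin.close()
            self.proc.stdout.close()
            self.proc.wait()
            self._start()
            self.restarts += 1
            raise ExtensionCrashed(name)
        return json.loads(reply)

    def close(self) -> None:
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

A caller would create one IsolatedHost, invoke host.call("strlen", "abc"), and treat ExtensionCrashed as a failed request rather than a failed server. The price of this safety is the cost of crossing the process boundary on every call, which is why a faster channel matters.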
The server administrator who installs Phalanger and decides which libraries are available to page developers will have the tools necessary to install any library implementing the interface designed by the PHP authors. The installation is performed by a standalone application shipped with Phalanger. It generates code that encapsulates the given library; we call this code a managed wrapper of the extension. The wrapper gives access to the functions contained in the extension from the managed environment (for example from C#) via the ExtManager. It contains definitions (stubs) of the encapsulated functions together with their type information. Since the dynamic libraries carry no such information (regarding function argument or return value types), it has to be provided by the user in XML format. For the most commonly used extensions, these type-information files form a part of the Phalanger installation together with the compiled wrappers.
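The following sketch shows how typed stubs might be derived from such an XML description. The XML schema, the function name str_pad_left and the invoke callback are all invented for illustration; the actual Phalanger type-information format is not reproduced here, and the real wrappers are compiled code rather than closures.

```python
import xml.etree.ElementTree as ET
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical type-information file for one extension.
TYPE_INFO = """
<extension name="example">
  <function name="str_pad_left" returns="string">
    <param name="text" type="string"/>
    <param name="width" type="int"/>
  </function>
</extension>
"""

# Mapping from the (assumed) XML type names to Python types.
PY_TYPES = {"string": str, "int": int, "float": float, "bool": bool}


def _make_stub(name: str, params: List[Tuple[str, type]], returns: type,
               invoke: Callable[[str, tuple], Any]) -> Callable[..., Any]:
    def stub(*args: Any) -> Any:
        # Check arity and argument types before crossing into the extension.
        if len(args) != len(params):
            raise TypeError(f"{name} expects {len(params)} arguments")
        for value, (pname, ptype) in zip(args, params):
            if not isinstance(value, ptype):
                raise TypeError(f"{name}: {pname} must be {ptype.__name__}")
        # Forward the call and convert the result to the declared type.
        return returns(invoke(name, args))
    stub.__name__ = name
    return stub


def build_wrapper(xml_text: str,
                  invoke: Callable[[str, tuple], Any]) -> Dict[str, Callable[..., Any]]:
    """Create one typed stub per <function> element in the description."""
    root = ET.fromstring(xml_text)
    stubs: Dict[str, Callable[..., Any]] = {}
    for fn in root.findall("function"):
        params = [(p.get("name"), PY_TYPES[p.get("type")])
                  for p in fn.findall("param")]
        stubs[fn.get("name")] = _make_stub(
            fn.get("name"), params, PY_TYPES[fn.get("returns")], invoke)
    return stubs
```

Here invoke plays the role of the call into the extension host; a stub rejects wrongly typed arguments before anything reaches the native library.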
The cooperation with ASP.NET is enabled by an object implementing an interface for responding to requests sent by clients to a web server (an HTTP handler). A request is received by IIS (Internet Information Services), which should be configured to pass .php page requests to the ASP.NET server. This server process hosts .NET Framework applications. Its address space is divided into several logical spaces called AppDomains, each created to serve requests for one web application (not only Phalanger ones). When a request for a Phalanger application arrives at the server, an object called the request handler is created and starts to process the request. First it checks for a cached compilation of the requested script. If the compiled code is not found, the compiler is executed to create it and store it in the cache. The compiled code is then instantiated and executed within a script context, which is also created per request. The context contains the run-time modifiable configuration and other data that are valid only while the request is being processed.
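The request flow described above (cache lookup, compilation on a miss, execution in a per-request context) can be summarized in a short sketch. It is a simplified model rather than the Phalanger handler: Python code objects stand in for compiled scripts, a dictionary stands in for the script context, and the names RequestHandler and handle are hypothetical.

```python
from typing import Any, Dict


class RequestHandler:
    """Simplified model of the per-request processing steps."""

    def __init__(self, sources: Dict[str, str]) -> None:
        self.sources = sources             # script path -> source text
        self.cache: Dict[str, Any] = {}    # script path -> compiled code
        self.compilations = 0

    def handle(self, path: str, config: Dict[str, str]) -> str:
        # 1. Look for a cached compilation of the requested script.
        code = self.cache.get(path)
        if code is None:
            # 2. Cache miss: compile the script and store the result.
            code = compile(self.sources[path], path, "exec")
            self.cache[path] = code
            self.compilations += 1
        # 3. Create a fresh script context for this request. The configuration
        #    is copied, so changes made by the script end with the request.
        context = {"config": dict(config), "output": []}
        # 4. Execute the compiled code within that context.
        exec(code, {"context": context})
        return "".join(context["output"])


script = (
    "context['config']['greeting'] = 'Hi'\n"
    "context['output'].append(context['config']['greeting'] + ', ' "
    "+ context['config']['name'])\n"
)
handler = RequestHandler({"/index.php": script})
shared_config = {"greeting": "Hello", "name": "world"}
first = handler.handle("/index.php", shared_config)
second = handler.handle("/index.php", shared_config)
```

The second request reuses the cached code, and the script's change to the greeting does not leak into shared_config.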
ASP.NET applications are configured using XML configuration files. The main file (Machine.config) is stored in the .NET Framework directory. Individual web applications are configured using Web.config files stored in their respective directories, with the configuration being inherited through the directory hierarchy. Phalanger uses the same mechanism. In a typical installation, there is a Phalanger virtual directory in the root directory of the web server. The main configuration file (with options similar to those of the php.ini file) will be the Web.config in the Phalanger directory. This file is managed by the web-server administrator, who will create subdirectories for individual web applications and allow them to reconfigure some settings through their own Web.config files.
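Inheritance through the directory hierarchy amounts to merging settings from the root directory down to the application directory, with deeper directories overriding shallower ones. The sketch below models only that merge; it does not parse Web.config files, and the option names and directory layout are assumptions chosen for the example.

```python
from pathlib import PurePosixPath
from typing import Dict


def effective_config(configs: Dict[str, Dict[str, str]],
                     directory: str) -> Dict[str, str]:
    """Merge per-directory settings from the root down to 'directory'."""
    path = PurePosixPath(directory)
    # Directories from the root to the requested one, root first.
    chain = [path, *path.parents][::-1]
    merged: Dict[str, str] = {}
    for d in chain:
        # Settings found deeper in the hierarchy override inherited ones.
        merged.update(configs.get(str(d), {}))
    return merged


# Hypothetical settings: the option names are illustrative only.
configs = {
    "/": {"debug": "off"},
    "/Phalanger": {"memory_limit": "8M", "display_errors": "off"},
    "/Phalanger/shop": {"display_errors": "on"},
}
shop = effective_config(configs, "/Phalanger/shop")
blog = effective_config(configs, "/Phalanger/blog")
```

The shop application overrides display_errors, while the blog application, having no settings of its own, inherits everything from the Phalanger directory.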
VS.NET from version 2003 supports the integration of additional languages into the editor using VSIP (the Visual Studio Industry Partner program). The documentation of this interface is available to the academic community for free. It enables the implementation of code highlighting, IntelliSense, project types and further VS.NET features and components that make work with a specific programming language comfortable.
Our integration introduces the PHP language into VS.NET as a specific project type supporting syntax highlighting and compilation. PHP files are compiled to .NET Framework executables and can be executed and even traced from the VS.NET environment using the generated debug information. To make script development and debugging more comfortable, IntelliSense support for PHP projects and direct insight into PHP variables may be added in future versions, but this is not a priority for the current project. The VS.NET integration is best regarded as an add-on to Phalanger.
Phalanger is a Microsoft .NET Framework application. It requires the Microsoft .NET Framework 1.1 and the Microsoft Windows 2000/XP/2003 operating system.
Integration with ASP.NET is the principal feature of Phalanger. It requires Internet Information Services (IIS) version 5 or 6 with ASP.NET installed. On IIS-enabled systems, ASP.NET is installed automatically by the .NET Framework setup.
To benefit from the additional Visual Studio integration feature, Microsoft Visual Studio .NET 2003 with the Visual Studio Industry Partner (VSIP) package has to be installed.