Introduction
- Ever wondered what is inside PHP, and how each language construct and function works?
- How are optional parameters handled? Why do certain functions behave oddly, and what do their inner workings look like?
- This tutorial shows how to navigate to a function's definition in the PHP source and understand it.
- Since PHP is open source, you can read all of its code on GitHub (a clone of the official Git repository). Quickly finding a particular function on GitHub is cumbersome, though, so this tutorial uses an alternative tool.
- The PHP interpreter is written mainly in C. It relies on a lexical analyser, a Yacc-style parser (Yacc stands for "Yet Another Compiler-Compiler"), a configure step that identifies the computer hardware and operating system, and finally a virtual machine, the Zend Virtual Machine (ZVM for short), which runs our code.
Let's Explore the Directory Structure and Patterns
- Most of the PHP source is written in C, which looks quite similar to PHP, so the code is not hard to follow.
- When the official documentation is too abstract, this procedure is a handy way to understand exactly what a function does.
- To explore the PHP source we will use Adam Harvey's PHP source browser. LXR link: https://php-lxr.adamharvey.name/source/
- The LXR home page has a search form on the left and a source selection box on the right. To keep things simple, we will select php-7.3.
- Double-click php-7.3 in the selection box to go to the source listing.
- In the directory listing you will find some important directories:
  - ext contains the extensions, one sub-directory per extension (the string functions, for example, live in ext/standard).
  - main contains memory allocation code, directory scanning code, and so on.
  - Zend is the Zend Engine code, which contains the compiler, the language features and the VM (virtual machine).
- Once you have explored the ext and Zend directories, it becomes clear that the items under "Language Reference" in the official documentation relate to the Zend directory, and those under "Function Reference" relate to the ext directory.
- Let's open the file zend_builtin_functions.c in the Zend directory and look at the list of functions defined in it.
- Reading through the code, you will see very familiar functions such as zend_version, func_num_args, strlen, property_exists, class_exists, etc.
- To dive deeper into the internals, we have chosen the string function explode as our example.
- Here is the explode documentation link: explode()
Explode() overview
- The explode function is available in PHP 4, 5 and 7. (Newbie hint: there is no PHP 6. The core team planned a PHP 6 with Unicode support long ago, but it was never released. To avoid confusion, and for marketing reasons, the release after PHP 5 was named PHP 7.)
- explode has three parameters in total: two required and one optional.
- It returns either an array or boolean false.
- A negative limit is supported from version 5.1.0 onwards.
- For more details about this function, see the official documentation linked in the last point of the previous section.
- Now let's move on to the internal working of this function.
Search using LXR utility
- Go to the LXR home page. Link: https://php-lxr.adamharvey.name/source/
- In the first field of the form (field label: Full Search), type explode and click the Search button.
- This returns many matches. To narrow them down, change the search term to "PHP_FUNCTION explode". Note: the search term must be enclosed in double quotes.
- The search result now lists two files, php_string.h and string.c.
- The file php_string.h plays a role similar to an interface or abstract class for a complete class. The .h extension denotes a header file for the main C file.
- Our target is string.c. Click the line number in the search result to jump to the definition of explode at line 1155.
Explode() internal function call chain
- The definition of explode starts with PHP_FUNCTION(explode) on line 1155.
- The definition is enclosed between the markers {{{ and }}}. Note: scroll up a little to see the opening marker. These markers make it easy to see where a function definition starts and ends.
- PHP_FUNCTION is not built-in C syntax; it is a C macro.
- A macro is always defined with #define.
- So we search on the LXR home page for "#define PHP_FUNCTION".
- The result list contains the file php.h; click line number 409.
- On the source page that opens, you will find that PHP_FUNCTION is defined as another macro, ZEND_FUNCTION.
- The hyperlink on ZEND_FUNCTION takes you to the search form with a filtered list, but that list still shows too many records.
- We will use a more specific search term to filter the results for our need.
- Searching for "#define ZEND_FUNCTION" lists the file zend_API.h; click line number 64 in that list.
- Line 64 of zend_API.h contains the following code:
  #define ZEND_FUNCTION(name) ZEND_NAMED_FUNCTION(ZEND_FN(name))
- As you can see, line 64 uses two more macros, ZEND_NAMED_FUNCTION and ZEND_FN.
- Let's look at the inner macro, ZEND_FN, first.
- In the LXR search form, enter "#define ZEND_FN" in the Full Search field and click the Search button.
- The result list contains the file zend_API.h with line number 61; click the line number.
- That line contains the following code:
  #define ZEND_FN(name) zif_##name
  (In C, ## is the token concatenation operator, also called token pasting. It joins zif_ and the argument into a single identifier, here zif_explode. The result is an identifier, not a string.)
- The result of ZEND_FN is passed as the argument to the macro ZEND_NAMED_FUNCTION.
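To see token pasting in isolation, here is a minimal sketch that is independent of the PHP source. The names DEMO_FN, DEMO_STR and demo_explode are made up for this illustration; only the use of ## mirrors the ZEND_FN definition discussed above.

```c
#include <string.h>

/* Hypothetical macro, modelled on the shape of ZEND_FN:
 * ## pastes "demo_" and the argument into one identifier
 * while the code is being preprocessed. */
#define DEMO_FN(name) demo_##name

/* Helpers that turn the pasted identifier into text so it can be inspected.
 * The two levels make DEMO_FN(...) expand before # stringifies it. */
#define DEMO_STR_INNER(x) #x
#define DEMO_STR(x) DEMO_STR_INNER(x)

/* After preprocessing this is an ordinary function named demo_explode. */
int DEMO_FN(explode)(void)
{
    return 42;
}

/* Returns the identifier that DEMO_FN(explode) expands to, as a string. */
const char *demo_pasted_name(void)
{
    return DEMO_STR(DEMO_FN(explode));
}
```

The function can then be called by its pasted name, demo_explode(), just as the engine ends up with a function named zif_explode.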
- In the LXR search form, enter "#define ZEND_NAMED_FUNCTION" in the Full Search field and click the Search button.
- The definition is:
  #define ZEND_NAMED_FUNCTION(name) void ZEND_FASTCALL name(INTERNAL_FUNCTION_PARAMETERS)
- For explode, ZEND_NAMED_FUNCTION(zif_explode) therefore expands to:
  void ZEND_FASTCALL zif_explode(INTERNAL_FUNCTION_PARAMETERS)
- If we search for "#define ZEND_FASTCALL", we get a list of search results; click the line number of the first one.
- On the page that opens, scroll up a little and you will see the complete C preprocessor conditional block.
- This block is a set of conditions that define ZEND_FASTCALL according to the compiler in use, since Windows, Linux and the other supported operating systems use different compilers.
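The idea of such a conditional block can be sketched as follows. This is not the block from the PHP source: the conditions, the DEMO_FASTCALL name and the demo_ functions are assumptions chosen for illustration, and the real conditions in PHP differ.

```c
#include <string.h>

/* Hypothetical macro in the spirit of ZEND_FASTCALL: pick a
 * calling-convention keyword the current compiler understands,
 * or expand to nothing when no special keyword applies. */
#if defined(__GNUC__) && defined(__i386__)
# define DEMO_FASTCALL __attribute__((fastcall))
# define DEMO_FASTCALL_KIND "gcc-fastcall-attribute"
#elif defined(_MSC_VER) && defined(_M_IX86)
# define DEMO_FASTCALL __fastcall
# define DEMO_FASTCALL_KIND "msvc-fastcall-keyword"
#else
# define DEMO_FASTCALL
# define DEMO_FASTCALL_KIND "default-convention"
#endif

/* The keyword only changes how arguments are passed, not the result. */
int DEMO_FASTCALL demo_add(int a, int b)
{
    return a + b;
}

/* Reports which branch of the conditional block was selected. */
const char *demo_fastcall_kind(void)
{
    return DEMO_FASTCALL_KIND;
}
```

Whichever branch is taken, the function source stays the same; only the text the macro expands to changes from one compiler to another.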
- Let's arrange the macro chain as a sequence, to make it easier to keep in mind:
  PHP_FUNCTION(explode)
  → ZEND_FUNCTION(explode)
  → ZEND_NAMED_FUNCTION(ZEND_FN(explode))
  → ZEND_NAMED_FUNCTION(zif_explode)
  → void ZEND_FASTCALL zif_explode(INTERNAL_FUNCTION_PARAMETERS)
- Thank you for your patience in following the flow.
- The reason for walking through these steps before the internal flow of explode (or any other core function) is to show that a core function is a compiled C function that the engine calls directly, which is a more optimised path than a standard userland function.
- So our explode function travels from PHP_FUNCTION(explode) to the optimised C function zif_explode.
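The whole chain can be reproduced in a few lines of standalone C. This is a sketch only: every DEMO_ and APP_ name is invented, the dif_ prefix stands in for zif_, and the parameter list is a simplified assumption, since the real INTERNAL_FUNCTION_PARAMETERS expands to the engine's own parameters.

```c
#include <string.h>

/* Stand-in for INTERNAL_FUNCTION_PARAMETERS. An output buffer and its
 * size are assumed here in place of the engine's real parameter list. */
#define DEMO_INTERNAL_FUNCTION_PARAMETERS char *return_value, size_t return_size

/* Calling-convention slot, left empty in this sketch. */
#define DEMO_FASTCALL

/* Same shape as the macro definitions walked through above, renamed. */
#define DEMO_FN(name)             dif_##name
#define DEMO_NAMED_FUNCTION(name) void DEMO_FASTCALL name(DEMO_INTERNAL_FUNCTION_PARAMETERS)
#define DEMO_FUNCTION(name)       DEMO_NAMED_FUNCTION(DEMO_FN(name))
#define APP_FUNCTION              DEMO_FUNCTION

/* Expands to: void dif_hello(char *return_value, size_t return_size) */
APP_FUNCTION(hello)
{
    if (return_size > 0) {
        strncpy(return_value, "hello", return_size - 1);
        return_value[return_size - 1] = '\0';
    }
}
```

Calling dif_hello(buf, sizeof buf) fills buf with "hello". Note that return_value is never declared inside the function body; it comes from the parameter-list macro, which is also how return_value becomes available inside PHP_FUNCTION(explode).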
Explode() function definition in C language
- As seen in the previous section, the explode code starts at line 1153 with the {{{ marker and ends at line 1192 with the }}} marker.
- Line 1155 is the macro call, which was explained in the previous section.
- Lines 1157 – 1159 declare the C local variables; some of them receive the arguments passed to explode.
- Lines 1161 – 1166 copy the runtime input parameters into the local variables:
  - Line 1161: ZEND_PARSE_PARAMETERS_START(2, 3) is a macro that starts the process of copying the runtime values into the local variables declared just above it. It takes two arguments: the number of mandatory parameters and the maximum number of parameters.
  - Line 1162: Z_PARAM_STR(delim) is a macro that accepts a string value. It stores the delimiter passed as the first parameter of explode in the local variable delim.
  - Line 1163: Z_PARAM_STR(str) stores the string passed as the second parameter of explode in the local variable str.
  - Line 1164: Z_PARAM_OPTIONAL marks every parameter from here to the end of the parsing block as optional.
  - Line 1165: Z_PARAM_LONG(limit) is a macro that copies a PHP int into a C variable of the integer type zend_long. Here, explode's optional third parameter $limit is copied into the C variable limit. If $limit is not passed, limit keeps its default value, the constant ZEND_LONG_MAX, whose size depends on whether the system is 32-bit or 64-bit.
  - Line 1166: ZEND_PARSE_PARAMETERS_END(); marks the end of parameter parsing and of this block. Note that it is the only line in the block that ends with a semicolon.
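The "two required, three at most, default for the optional one" idea can be sketched in plain C. This is not how the Zend macros are implemented: the struct, the function name and the convention of passing arguments as strings are all assumptions made to keep the example self-contained, and type checking is left out.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical bundle of parsed arguments for an explode-like function. */
struct demo_explode_args {
    const char *delim;
    const char *str;
    long limit;
};

/* Returns 0 on success, -1 when the argument count is outside [2, 3]. */
int demo_parse_explode_args(int argc, const char *const *argv,
                            struct demo_explode_args *out)
{
    const int min_args = 2;   /* number of mandatory parameters */
    const int max_args = 3;   /* maximum number of parameters   */

    if (argc < min_args || argc > max_args) {
        return -1;
    }

    /* Default first: it survives when the optional argument is absent. */
    out->limit = LONG_MAX;

    out->delim = argv[0];     /* required */
    out->str = argv[1];       /* required */
    if (argc > 2) {           /* optional */
        out->limit = strtol(argv[2], NULL, 10);
    }
    return 0;
}
```

With two arguments the limit stays at LONG_MAX, which plays the role ZEND_LONG_MAX plays in explode: "no limit given".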
- Lines 1168 – 1171 check whether delim is an empty string by testing its length, and raise an error if it is:
  - Line 1168: if (ZSTR_LEN(delim) == 0) uses the macro ZSTR_LEN, which returns the length stored with the string, and compares it with the integer 0.
  - If the comparison is true, execution enters the block; otherwise the block is skipped completely.
  - Line 1169: php_error_docref(NULL, E_WARNING, "Empty delimiter"); raises a warning. The first argument of php_error_docref is a char pointer, here NULL; the remaining two arguments are self-explanatory.
  - Line 1170: RETURN_FALSE is a macro that sets the return value to false and returns from the function. The false type is represented by the constant IS_FALSE, whose value is 2. You can verify this by following the hyperlink on RETURN_FALSE, which leads to ZVAL_FALSE, and so on.
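The early-exit pattern of lines 1168 – 1171 can be imitated with a small tagged value. Everything here is hypothetical (the demo_ types, DEMO_STR_LEN, DEMO_RETURN_FALSE and the DEMO_IS_LONG tag); only the value 2 for the false tag is taken from the text above.

```c
#include <stddef.h>

/* Hypothetical type tags for a tagged value. */
enum demo_type { DEMO_IS_FALSE = 2, DEMO_IS_LONG = 4 };

/* Hypothetical length-prefixed string: the length is stored, not counted. */
struct demo_string {
    size_t len;
    const char *val;
};
#define DEMO_STR_LEN(s) ((s)->len)

/* Hypothetical tagged value standing in for a zval. */
struct demo_value {
    enum demo_type type;
    long lval;
};

/* Mark the result as false and leave the current function. */
#define DEMO_RETURN_FALSE                       \
    do {                                        \
        return_value->type = DEMO_IS_FALSE;     \
        return;                                 \
    } while (0)

/* Stores the delimiter length in return_value, or false if it is empty. */
void demo_delim_length(const struct demo_string *delim,
                       struct demo_value *return_value)
{
    if (DEMO_STR_LEN(delim) == 0) {
        DEMO_RETURN_FALSE;
    }
    return_value->type = DEMO_IS_LONG;
    return_value->lval = (long)DEMO_STR_LEN(delim);
}
```

The macro can only be used inside a function that has a return_value in scope, which is the same constraint that applies to RETURN_FALSE inside a PHP_FUNCTION body.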
- Line 1173: array_init(return_value) is an array API function that initialises the hash table for an array. Additional info: internally, a PHP array is itself a hash map. (If possible, a future tutorial will cover the internals of arrays!)
- Lines 1175 – 1181 handle the case where the source string to be exploded is an empty string, and return early from the PHP_FUNCTION body:
  - Line 1175: if (ZSTR_LEN(str) == 0) checks whether the length of the string to explode is zero.
  - Line 1176: if (limit >= 0) checks whether the optional third parameter of explode (the C local variable limit) is greater than or equal to zero. If it is, the inner block runs.
  - Line 1177: ZVAL_EMPTY_STRING(&tmp); sets the zval tmp to an empty string.
  - Line 1178: zend_hash_index_add_new(Z_ARRVAL_P(return_value), 0, &tmp); first uses Z_ARRVAL_P to fetch the hash table of the array in return_value, then passes it to zend_hash_index_add_new, a hash table API wrapper around the function _zend_hash_index_add_or_update_i, which adds tmp at index 0. For reference, see the Doxygen documentation.
  - Line 1180: a plain return. The result is an array containing one empty string if limit is zero or positive, and an empty array otherwise.
- Lines 1183 – 1190 are the decision logic based on the value of limit. For a limit greater than 1, the string is split from left to right into at most limit pieces, with the last piece holding the rest of the string. For a negative limit, the string is split and the last -limit pieces are left out. The final else handles a limit of 0 or 1 and returns an array with the whole string at index 0.
  - Line 1183: if (limit > 1) checks whether the local variable limit is greater than 1.
  - Line 1184: php_explode(delim, str, return_value, limit); calls php_explode, a function declared with the PHPAPI macro. All arguments are easy to understand except the third, return_value, which is a zval* provided by the PHP_FUNCTION macro.
  - Line 1185: else if (limit < 0) checks whether the local variable limit is less than 0, i.e. a negative number.
  - Line 1186: php_explode_negative_limit(delim, str, return_value, limit); is another PHPAPI function. It takes the same arguments as php_explode but handles a negative limit.
  - Line 1187: the else branch, the last block of this if-else ladder.
  - Line 1188: ZVAL_STR_COPY(&tmp, str); is a macro that copies the string in the local variable str into the zval local variable tmp.
  - Line 1189: adds tmp to the result array at index 0, in the same way as line 1178.
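To tie the branches together, here is a standalone sketch of the same decision ladder in plain C. It does not use the Zend API: NUL-terminated strings, the fixed-size demo_parts result and all demo_ names are assumptions for illustration, and inputs with more than DEMO_MAX_PARTS pieces are out of scope.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_PARTS 16

/* One piece of the input: a pointer into the original string plus a length. */
struct demo_part {
    const char *start;
    size_t len;
};

struct demo_parts {
    struct demo_part items[DEMO_MAX_PARTS];
    size_t count;
};

/* Appends one piece; pieces beyond the fixed capacity are dropped. */
static void demo_push(struct demo_parts *out, const char *start, size_t len)
{
    if (out->count < DEMO_MAX_PARTS) {
        out->items[out->count].start = start;
        out->items[out->count].len = len;
        out->count++;
    }
}

/* Splits left to right into at most max_parts pieces; the last keeps the rest. */
static void demo_split(const char *delim, const char *str,
                       struct demo_parts *out, long max_parts)
{
    size_t dlen = strlen(delim);
    const char *p = str;
    const char *hit;

    while (max_parts > 1 && (hit = strstr(p, delim)) != NULL) {
        demo_push(out, p, (size_t)(hit - p));
        p = hit + dlen;
        max_parts--;
    }
    demo_push(out, p, strlen(p));
}

/* Returns -1 for an empty delimiter, 0 otherwise (result in out). */
int demo_explode(const char *delim, const char *str, long limit,
                 struct demo_parts *out)
{
    out->count = 0;

    if (delim[0] == '\0') {
        return -1;                        /* caller reports the warning */
    }
    if (str[0] == '\0') {
        if (limit >= 0) {
            demo_push(out, str, 0);       /* array with one empty string */
        }
        return 0;                         /* otherwise an empty array */
    }

    if (limit > 1) {
        demo_split(delim, str, out, limit);
    } else if (limit < 0) {
        /* Number of trailing pieces to drop, i.e. -limit, computed
         * without overflowing when limit is LONG_MIN. */
        size_t drop = (size_t)(-(limit + 1)) + 1;
        demo_split(delim, str, out, LONG_MAX);
        out->count = (drop >= out->count) ? 0 : out->count - drop;
    } else {
        demo_push(out, str, strlen(str)); /* limit 0 or 1: whole string */
    }
    return 0;
}
```

For example, demo_explode(",", "a,b,c", 2, &r) leaves two pieces, "a" and "b,c", while a limit of -1 leaves "a" and "b".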
Conclusion
- Now you are ready to explore any PHP function by applying the same procedure we discussed.
- You learned the directory structure of the PHP source.
- You learned how to use the LXR utility.
- You learned how the PHP_FUNCTION macro works.