Programming With Sip -- Some Examples

Introduction

This is by no means an authoritative discussion about SIP. Rather, it is a chronicle of the adventures and misadventures of a bumbling newbie trying to learn to use a great tool with little documentation. Some references that are essential companions to this document include:

PyQt: an implementation of Python bindings for Qt. Reading the sip files for these classes is instructive.
Sip documentation: this is by no means complete, but it is currently the best available.
Python documentation: Python ships with documentation, which is also browsable online. The functions used to access Python objects from C are well documented in the API reference. Another source is the Extending and Embedding guide, which has more of a tutorial style.

For the convenience of the reader, I've included a concise summary of the sip and python API items used here.

Downloading the Examples

The string example and fraction example may be downloaded.

A Note About Versions

sip has changed from version to version. I'm using Python 2.2 and sip 3.2.4. Depending on the version you're using, sip will behave a little differently. The most notable addition in the new version of sip is support for classes that are wrapped inside namespaces.

Why I'm Interested in Sip

I'm currently working on C++ software, and want to create an interface to my libraries in a more user-friendly programming language. I make use of Qt, so the fact that Sip is committed to supporting Qt is a big plus from my point of view.

An Example: C++ Strings in Python

Here we present a simple example of sip: an implementation of the C++ string class. There are two main aspects to implementing the bindings: writing the .sip file that the sip preprocessor uses to generate bindings, and automating the build process. Let's start with the sip file, which is pretty simple to create. The sip file itself is here. First, we need to declare the name of our Python module:

%Module String

Then we include the declarations given in the header file:

%HeaderCode
#include <string>
%End

Then we declare the interface. This is largely uneventful.

namespace std
{
	class string
	{
	public:
		string();
		string(const char*);
		string(const std::string&);

		bool empty();
		int length();
		int size();
    /* blah blah blah */

The only comments worth making here are:

The name std::string needs to be qualified inside the declaration.
I can't find a clean and portable way to get std::string::size_type to work, so I'm just using int.
It's not necessary to re-declare all methods, only those you want available in python.
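Since the wrapper exposes length() and size() as int, a very long string could in principle report a size that an int cannot hold. Here is a minimal sketch of a checked narrowing helper in plain C++, with no sip or Python dependency. The name size_to_int is made up for illustration and is not part of the sip file or of sip itself.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of sip): narrow std::string::size_type to
// int, refusing values that do not fit instead of silently wrapping around.
int size_to_int(std::string::size_type n)
{
    if (n > static_cast<std::string::size_type>(INT_MAX))
        throw std::overflow_error("size does not fit in an int");

    int result = static_cast<int>(n);
    assert(result >= 0);  // sanity check: a valid size never becomes negative
    return result;
}
```

For everyday strings the check never fires, which is why simply declaring int in the sip file works in practice.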

One could leave it at that and just declare those methods. However, there are some other things one may want a string class to do ...

Operator Overloading

Array-like element access (like the C++ operator[])
Implicit conversion to a python string.

We need to implement these explicitly. More general and terse (but unfortunately somewhat dated) documentation about operator overloading in sip can be found in the Sip documentation. Here, we will maintain a tutorial style as opposed to a comprehensive one. The parameters for all Python operators arrive in the Python tuple sipArgs.

First, let's implement the string conversion method, __str__. We start with some code to declare the method:

void __str__ () /NonLazy/;
%MemberCode

First, we declare some variables. ptr is a pointer to the underlying C++ object, and s is a temp to hold the return value of the c_str() method that we use to extract a C-style string.

const char* s;
std::string* ptr;

Note that we don't have to declare the parameters, because sip declares them for us when it processes the file. Also note that it doesn't matter what return type we declare -- the python function simply returns whatever we return in the MemberCode. The parameters are given in sipArgs. The variable sipArgsParsed is also declared in the generated code. We need to unpack the function argument first. sip offers the sipParseArgs() function for it. The format is

int sipParseArgs(int* sipArgsParsed, PyObject* sipArgs, const char* format, ... )

where sipArgsParsed returns the number of arguments parsed, sipArgs is the argument tuple, format is a scanf-like string that tells the function what types of arguments to look for, and the other arguments are dependent on format. In this example, we use "J1" for the format. This tells sipParseArgs to expect an object of a sip class type, and to take two further arguments: an argument of type PyObject* which should be the class object expected (in Python, each class is an object, so for each sip wrapped object, there is a corresponding type object named sipClass_classname), and a pointer-to-pointer to the corresponding C++ type. Here is a brief summary of the specifiers:

format	operand type	types expected	description
O	Any Python object	PyObject**	pointer to store the result in
T	Python object of a given type	PyTypeObject*, PyObject**	the type object corresponding to the desired type, and a pointer to store the result in
J1	Wrapped sip object	PyObject*, type**	the sip class object (sipClass_classname), and a pointer-to-pointer to store the result (a pointer to the dynamically allocated object in question)
i	integer	int*	pointer to an int to store the result in
h	short integer	short*	pointer to a short int to store the result in
l	long integer	long*	pointer to a long to store the result in
d	double	double*	pointer to a double to store the result in
f	float	float*	pointer to a float to store the result in
m	the object a method is invoked on	PyObject*, PyObject*, void**	the pointer sipThisObj (instance pointer), the class (sipClass_classname), and a pointer-to-pointer to a C++ object of the right type to store the result

sipParseArgs returns a true value if the call was successful. So we add this code:

        if (sipParseArgs(&sipArgsParsed,sipArgs,"J1",sipClass_std_string, &ptr))
        {
                // do stuff
        }

Having obtained ptr, we get the char* pointer from it using c_str(), and convert it to a Python string using the PyString_FromString function in the Python API. See section 7.3.1 of the Python API reference, which comes with the Python distribution, for more information about this function.

        if (sipParseArgs(&sipArgsParsed,sipArgs,"J1",sipClass_std_string, &ptr))
        {
	        s = ptr->c_str();
	        /* Python API reference, P40 */
	        return PyString_FromString(s);
        }

Next, we'd like to implement [] notation. This is done by implementing the __getitem__ method. Start by declaring the method:

void __getitem__ () /NonLazy/;
%MemberCode

Now for the C++ member code. First, we declare and extract the this pointer as per the previous example. Then we need to check that the requested index is not out of bounds. It is important to do this: uncaught C++ exceptions will cause Python to abort! So it's necessary either to prevent C++ exceptions from occurring, or to trap them and propagate Python exceptions.

if (a1 < 0 || a1 >= (int)ptr->length())
{
/* Python API Reference, Ch 4  */
	PyErr_SetString ( PyExc_IndexError  ,"string index out of range" );
	return NULL;
}

The PyErr_SetString function is part of the python API and is explained in the API reference. What these lines of code do is raise an exception if the argument is out of bounds. Otherwise, we may return a value:

return Py_BuildValue("c", ptr->at(a1));

Py_BuildValue is documented in Extending and Embedding Python, sections 1.3 and 1.7 (pp. 8-11).
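The two strategies mentioned above (prevent the C++ exception, or trap it and report an error) can be tried out without sip or Python at all. The following is only a sketch under that assumption: checked_at is a hypothetical helper, and the error string stands in for what the real member code would pass to PyErr_SetString before returning NULL.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the wrapper's error path. Returns true and stores
// the character in *out on success; returns false and stores a message in
// *error when the index is out of range. No exception ever escapes.
bool checked_at(const std::string& s, long index, char* out, std::string* error)
{
    assert(out && error);

    // Strategy 1: prevent the exception by validating the index first.
    if (index < 0) {
        *error = "string index out of range";
        return false;
    }

    // Strategy 2: trap whatever is still thrown and turn it into an error.
    try {
        *out = s.at(static_cast<std::string::size_type>(index));
        return true;
    } catch (const std::out_of_range&) {
        *error = "string index out of range";
        return false;
    }
}
```

Either strategy alone would do for at(). Showing both makes the point that the wrapper, not the interpreter, is responsible for keeping C++ exceptions from crossing into Python.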

Building It

So the problem we now have is to compile our code. Here's what works for me: I have a

Makefile,
my sipfile, and
a sipcode sub-directory. The sipcode directory includes a Makefile and automatically generated code.

The top level Makefile is pretty simple: it just runs sip and then builds in the sub-directory.

SIP=/usr/bin/sip

sip: $(OBJS)
	$(SIP) -s ".cc" -c sipcode string.sip
	cd sipcode && make

sip will generate the following files in this example:

sipStringDeclString.h: global declarations for the module
Stringcmodule.cc: global definitions for the module
sipStringstdstring.h: declarations for the string class
sipStringstdstring.cc: definitions for the string class
String.py

Note the way the namespace is munged into the names. Now the Makefile to build this in the sipcode directory looks like this:

module=String
class=stdstring

objs=$(module)cmodule.o sip$(module)$(class).o

PYTHON_INCLUDES=-I/usr/include/python2.2
SIP_INCLUDES=-I/usr/local/include/sip
PYTHONLIBS=-L/usr/lib/python2.2/site-packages 

%.o: %.cc
	$(CXX) $(CXXFLAGS) -c -I.. -I. $(PYTHON_INCLUDES) $(SIP_INCLUDES)  $<
	
	
all: libs

libs:  $(objs)
	$(CXX) -shared $(PYTHONLIBS) -lsip -o lib$(module)cmodule.so *.o 


clean:
	rm -f *.o *.so *.cc *.h sip_helper *.py *.pyc

And that's it! Now we've got a String module that compiles. Installing it into the proper directories just involves copying the .so file and the .py file into the site-packages directory in your python installation.

Another Example: Fractions in Sip

An example that gives us a chance to play with more operators, and other finer points of sip, is that of implementing a fraction data type. Consider the fraction class with the declaration frac.h and member function definitions frac.cc. This class was written for completeness and economy of the coder's time (-; so a lot of operators are implemented in terms of others. To implement a Python version of this, we redeclare the member functions, excluding operators.

Some comments about the class:

A destructor is implemented for debugging purposes. This helps us debug methods that manipulate reference counts.
The class throws exceptions. This is a problem for Python. The problem could possibly be solved by deriving a class that intercepts operations that can throw, and propagates a Python exception.

Declaring the basic interface is straightforward:

class Fraction
{
public:
	Fraction(int , int=1 ); 
	Fraction(const Fraction&);
	int numerator() const;
	int denominator() const;

	/* we'll put more code here later */
};

int gcd(int, int);
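The class in frac.h is not reproduced here, so the following is only an assumed stand-in with the same interface as the declarations above. It is a sketch, not the downloadable class: the reduction to lowest terms, the sign convention, and the exception type thrown on a zero denominator are all my own assumptions.

```cpp
#include <cassert>
#include <stdexcept>

// Greatest common divisor by Euclid's algorithm; the result is non-negative.
int gcd(int a, int b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        int r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Assumed stand-in for the Fraction class declared in frac.h.
class Fraction
{
public:
    Fraction(int num, int den = 1) : num_(num), den_(den)
    {
        if (den_ == 0)
            throw std::domain_error("zero denominator");
        if (den_ < 0) { num_ = -num_; den_ = -den_; }  // keep the sign on top
        int g = gcd(num_, den_);  // g >= 1 because den_ is not zero
        num_ /= g;
        den_ /= g;
        assert(den_ > 0);
    }

    int numerator() const { return num_; }
    int denominator() const { return den_; }

private:
    int num_;
    int den_;
};
```

The throwing constructor is exactly the kind of operation that needs guarding once the class is exposed to Python.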

The interesting part is declaring the operators. Implementing the arithmetic binary operators +,-,*,/ involves essentially the same code, and most of the code just does error conversion and type checking. It would be nice to break this boilerplate code off into its own function. The strategy we use is to have a function that takes a function pointer as an argument. The function pointer is to a function that invokes one of the operators +,-,*,/.

%HeaderCode
#include <frac.h>
typedef Fraction* (*binary_fraction_op_t) (Fraction*,Fraction*);
PyObject* BinaryOp(PyObject* sipArgs, binary_fraction_op_t op);
Fraction* plus (Fraction* x, Fraction* y);
Fraction* minus (Fraction* x, Fraction* y);
Fraction* mult (Fraction* x, Fraction* y);
Fraction* div (Fraction* x, Fraction* y);
%End

Then we need to implement the body of these functions. The operator functions are very simple: they just perform the operation. Note that we use pointers all the time, because all our Fraction objects are heap allocated. Also note the use of sipNewCppToSelf(). This function is used to wrap a dynamically allocated (with new) C++ object in Python. The arguments are the object to be wrapped, the PyObject* corresponding to the Python class to be used, and flags (which are in practice always SIP_SIMPLE|SIP_PY_OWNED).

%C++Code

Fraction* plus (Fraction* x, Fraction* y) { return new Fraction(*x + *y); }
Fraction* minus (Fraction* x, Fraction* y) { return new Fraction(*x - *y); }
Fraction* mult (Fraction* x, Fraction* y) { return new Fraction(*x * *y); }
Fraction* div (Fraction* x, Fraction* y) { return new Fraction(*x / *y); }

PyObject* BinaryOp(PyObject* sipArgs, binary_fraction_op_t op)
{
	Fraction *ptr1, *ptr2;
	PyObject* res;
        int sipArgsParsed = 0;

	/** 
	  Extract operands and make sure that they are "really" fractions
	  */

        if (sipParseArgs(&sipArgsParsed,sipArgs,"J1J1",
                sipClass_Fraction, &ptr1,
                sipClass_Fraction, &ptr2
        )
        )
        {

	        /* ptr1 and ptr2 point to fractions */
	        return sipNewCppToSelf ( op(ptr1, ptr2), sipClass_Fraction,
                        SIP_SIMPLE | SIP_PY_OWNED );
        }
        return NULL;

}


%End
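The idea of routing every operator through one shared function can be seen in isolation, without sip or the Python API. The following is a minimal sketch under that assumption: Frac, frac_plus, frac_mult and binary_op are hypothetical names, results are not reduced to lowest terms, and a NULL return stands in for the error path of the real BinaryOp.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for the Fraction class; results are left unreduced.
struct Frac { int num; int den; };

// Pointer to a function that combines two operands into a new heap object.
typedef Frac* (*binary_frac_op_t)(const Frac*, const Frac*);

Frac* frac_plus(const Frac* x, const Frac* y)
{
    Frac* r = new Frac;
    r->num = x->num * y->den + y->num * x->den;
    r->den = x->den * y->den;
    return r;
}

Frac* frac_mult(const Frac* x, const Frac* y)
{
    Frac* r = new Frac;
    r->num = x->num * y->num;
    r->den = x->den * y->den;
    return r;
}

// Shared boilerplate: validate the operands once, then delegate to op.
// Returns NULL on bad input. The caller owns the result and must delete it.
Frac* binary_op(const Frac* a, const Frac* b, binary_frac_op_t op)
{
    assert(op != NULL);
    if (a == NULL || b == NULL || a->den == 0 || b->den == 0)
        return NULL;
    return op(a, b);
}
```

Adding another operator then costs one small function plus a one-line call, which is the same economy the sip member code relies on.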

This makes implementing the arithmetic operators simpler. Note that all the binary arithmetic operators return type PyObject*, and take two arguments of type PyObject*. The functions that implement these operators are documented in the API reference, section 6.2, the number protocol. For example, the __add__ method corresponds to the function PyNumber_Add. While the function signatures are obvious for a lot of the data types, some of them, like __coerce__, require one to read the documentation to understand how to implement them.

So here's the code to implement our operations:

void __add__() /NonLazy/;
%MemberCode
	return BinaryOp (sipArgs,plus);
%End

void __sub__ () /NonLazy/;
%MemberCode
	return BinaryOp (sipArgs,minus);
%End

void __mul__() /NonLazy/;
%MemberCode
	return BinaryOp (sipArgs,mult);
%End

void __div__() /NonLazy/;
%MemberCode
	return BinaryOp (sipArgs,div);
%End

We'd also like to permit explicit conversion to floating point numbers. We do this by implementing __float__. Note the python API function Py_BuildValue.

void __float__() /NonLazy/;
%MemberCode
	Fraction* ptr;
        if (sipParseArgs( &sipArgsParsed,sipArgs,"J1",sipClass_Fraction,&ptr ))
        {
	        double x = (double)ptr->numerator() / ptr->denominator();
	        return Py_BuildValue ( "d", x );
        }

%End

We're almost done. A desirable feature that would make our fraction data type interoperate better with Python's numerical types is an implementation of the "type promotion" feature, the __coerce__ method. Implementing this is a little tricky. Referring to the Python API documentation on the Number Protocol, 6.2 P29, we see that:

int PyNumber_Coerce(PyObject **p1, PyObject **p2)
This function takes the addresses of two variables of type PyObject*. If the objects pointed to by *p1 and *p2 have the same type, increment their reference count and return 0 (success). If the objects can be converted to a common numeric type, replace *p1 and *p2 by their converted value (with 'new' reference counts), and return 0. If no conversion is possible, or if some other error occurs, return -1 (failure) and don't increment the reference counts. The call PyNumber_Coerce(&o1, &o2) is equivalent to the Python statement 'o1,o2 = coerce(o1,o2)'

So, it's a good thing we read the documents. While one might reasonably guess the meaning of the return code, and deduce that the arguments are supposed to be overwritten with the return values, the reference counts are a potential trap. If we didn't read the document, our code would have segfaulted (like mine did when I was learning this!)
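To make the reference count trap concrete, here is a toy model of the first case of the quoted contract, written in plain C++ with no Python headers. ToyObject, toy_incref and toy_coerce are hypothetical names and only imitate the bookkeeping. Real Python objects are of course more involved, and the conversion case is left out.

```cpp
#include <cassert>

// Toy stand-in for a Python object: just a type tag and a reference count.
struct ToyObject { int type; int refcount; };

// Null-safe increment, in the spirit of Py_XINCREF.
void toy_incref(ToyObject* o) { if (o) ++o->refcount; }

// Models only the "same type" case: on success each object gains one
// reference and 0 is returned. On failure -1 is returned and no count
// is touched.
int toy_coerce(ToyObject** p1, ToyObject** p2)
{
    assert(p1 && p2 && *p1 && *p2);
    if ((*p1)->type != (*p2)->type)
        return -1;  // no conversion is attempted in this sketch
    toy_incref(*p1);
    toy_incref(*p2);
    return 0;
}
```

A caller that forgets the extra references on success will eventually release an object one time too many, which is the kind of mistake that produced the segfault described above.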

The basic aim of the coerce method then is to end up with two objects of the same type, and we know that the first is a fraction. We start by declaring some variables:

void __coerce__() /NonLazy/;
%MemberCode
	Fraction* ptr;
	long i;
	bool success = false;
        PyObject *a0, *a1;

a0 and a1 store the two arguments. We need to extract these from the argument tuple and check their types (alternatively, we could use sipParseArgs, but I'd like to illustrate a different approach).

        a0 = PyTuple_GetItem ( sipArgs,0 );
        a1 = PyTuple_GetItem ( sipArgs,1 );

It's important to be aware of what Python API functions do with reference counts. The Python documentation asserts that PyTuple_GetItem returns a borrowed reference, which means that the reference count of that tuple element is not increased. Next, we make sure that the first object is of the right type (it should be!)

	if (!sipIsSubClassInstance(a0, sipClass_Fraction))
                return NULL; // this should not happen

Then we check to see if both objects are of the same type. We already know that *a0 is of type Fraction, so we only need to check *a1. For this we use the same sip function as before, sipIsSubClassInstance().

int sipIsSubClassInstance(PyObject *inst, PyObject *baseclass);

This function returns a true value if inst is an object whose class is baseclass or some derived type of baseclass.

	if (sipIsSubClassInstance(a1, sipClass_Fraction))
	{
                return Py_BuildValue("(OO)",a0,a1);
	}

If the arguments are not both fractions, we need to check for integral types, and convert. We do this using the checking functions PyLong_Check() and PyInt_Check() and the conversions PyLong_AsLong() and PyInt_AsLong() (again, described in 6.1, API ref). Note that if these conversions are successful, the reference count of *a0 is incremented (the O format of Py_BuildValue does this). This has the same effect as returning a copy of *a0 with a reference count of 1.

	if ( PyLong_Check(a1) )
	{
		success = true;
		i = PyLong_AsLong (a1);
	}
	else if ( PyInt_Check(a1) )
	{
		success = true;
		i = PyInt_AsLong (a1);
	}


	if (success)
	{
                return Py_BuildValue("(Ol)",a0, i );

	}
	else 
	{
		return NULL; 
	}

And that's it. We now have a complete sip file. An interesting exercise would be to implement other operators, and/or implement __coerce__ for floating point data types, but this is sufficient to get the reader started.

List of Sip Functions Used in Examples

const void *sipGetCppPtr (sipThisType*,PyObject *);
PyObject *sipMapCppToSelf (const void *,PyObject *);
PyObject *sipNewCppToSelf (const void *,PyObject *,int);
int sipIsSubClassInstance (PyObject *,PyObject *);

Converting Between C++ and Python Data

PyObject *sipNewCppToSelf (const void * object,PyObject * class, int flags);

Convert a C++ object to a Python object. This function is used to return values from functions. For the flags, you will nearly always want to use SIP_SIMPLE | SIP_PY_OWNED. This means that Python is responsible for managing the object. I am still not clear what SIP_SIMPLE means, but it has something to do with the internal representation of the object.

PyObject *sipMapCppToSelf (const void * object,PyObject * class);

Convert a C++ object to a Python object without transferring ownership. C++ bears responsibility for deallocating the memory (so use sipNewCppToSelf to construct return values).

const void* sipGetCppPtr(sipThisType* object, PyObject* class);

Returns C++ class pointer. This function is unsafe, in that the way it's usually used involves a cast from a generic PyObject pointer to a sipThisType pointer. So it should only be used if the argument is known to be a sip object.

Type Checking

int sipIsSubClassInstance (PyObject * object,PyObject * class);

Check to see if object belongs to class or some subclass of class. This is useful for type checking, if you don't know anything about an object's type.

Python API Functions Used in/Related to Examples

These functions are all documented in the API reference, but are listed here for convenience. This is not meant to be comprehensive (for that, there's the Python API reference).

Numeric data types

Type Checking

int PyInt_Check(PyObject* o);
int PyLong_Check(PyObject* o);
int PyFloat_Check(PyObject* o);

Returns true value if o is respectively a python int, long or float object.

Type Conversion

long PyInt_AsLong(PyObject* o);
long PyLong_AsLong(PyObject* o);
long long PyLong_AsLongLong(PyObject* o);
double PyLong_AsDouble(PyObject* o);
double PyFloat_AsDouble(PyObject* o);

Convert Python objects to C values. Several conversions are offered for Python's long data type, because it is an arbitrary precision type. Conversions involving long raise OverflowError if unsuccessful.

Operator Overloading

Overloading numeric operators amounts to partially implementing Python's number protocol described in the API reference. In particular, one implements functions that provide the functionality documented there. The Python functions delegate to the method table built by sip, so sip__add__classname does the work when __add__ (or PyNumber_Add, which has the same effect) is called. There are a lot of arithmetic operators, and they're all documented in the API reference. Here, we present the ones used in the examples.

Arithmetic

PyObject* PyNumber_Add(PyObject* left, PyObject* right);
PyObject* PyNumber_Subtract(PyObject* left, PyObject* right);
PyObject* PyNumber_Multiply(PyObject* left, PyObject* right);
PyObject* PyNumber_Divide(PyObject* left, PyObject* right);

Respectively add, subtract, multiply, and divide two Python objects. Same as (respectively) the methods __add__, __sub__, __mul__, __div__ in Python. Methods should return NULL if unsuccessful. Python takes care of type checking and promotion to make sure both operands are of the same type.

Coercion

This is a little complex, so I will quote the API reference verbatim:

int PyNumber_Coerce(PyObject **p1, PyObject **p2)
This function takes the addresses of two variables of type PyObject*. If the objects pointed to by *p1 and *p2 have the same type, increment their reference count and return 0 (success). If the objects can be converted to a common numeric type, replace *p1 and *p2 by their converted value (with 'new' reference counts), and return 0. If no conversion is possible, or if some other error occurs, return -1 (failure) and don't increment the reference counts. The call PyNumber_Coerce(&o1, &o2) is equivalent to the Python statement 'o1,o2 = coerce(o1,o2)'

Conversion

PyObject* PyNumber_Int(PyObject*);
PyObject* PyNumber_Long(PyObject*);
PyObject* PyNumber_Float(PyObject*);

Implement the int(), long() and float() operators respectively.

Building Return Values

PyObject* Py_BuildValue(char* format, ... );

Construct a Python object from C variables. Return NULL on failure. The specific rules for format are quite long and are described in the Extending and Embedding guide.

Exceptions

void PyErr_SetString(PyObject* exception, char* message);

Exceptions behave in a way that may seem strange to a C++ programmer. One throws an exception by "setting" an exception flag. The exception objects are defined in python (See API reference, P16 4.1 for a complete list). One usually throws a standard exception type.
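For a C++ programmer, the "set a flag and return a failure value" style may be easier to grasp from a toy model. The sketch below does not use the Python API at all: g_error, toy_set_error, toy_error_occurred, toy_clear_error and toy_divide are hypothetical names that merely imitate the pattern.

```cpp
#include <cassert>
#include <string>

// Toy error indicator standing in for the interpreter's exception state.
static std::string g_error;  // empty means "no error pending"

void toy_set_error(const std::string& message) { g_error = message; }
bool toy_error_occurred() { return !g_error.empty(); }
void toy_clear_error() { g_error.clear(); }

// Stores a / b in *out and returns true, or sets the indicator and returns
// false. Nothing is thrown: the caller must check the return value.
bool toy_divide(int a, int b, int* out)
{
    assert(out);
    if (b == 0) {
        toy_set_error("division by zero");
        return false;
    }
    *out = a / b;
    return true;
}
```

The failure value tells the caller that something went wrong, and the indicator says what. That division of labour is the one PyErr_SetString and a NULL return follow in the member code earlier.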

Reference Counting

One should very rarely have to use any of these when writing sip bindings. They are needed occasionally to implement a Python method, e.g. __coerce__.

void Py_XINCREF(PyObject* o);
void Py_XDECREF(PyObject* o);

Respectively increment and decrement the reference count of o. If o is a null pointer, this has no effect.

Epilogue: Deriving From QWidget

Deriving from QWidget proves a tricky problem, because sip needs to be informed that the QWidget class has been defined elsewhere in python. To do this, one needs to use the %Import directive in sip. The other aspect of it that is quite tricky is invoking sip correctly, and finding the sipQtFeatures.h file. I've included a simple (perhaps even trivial), but working, example.