SVMpython
Contents
High Level View: Introduction, Download, Building, Using, Overview of svm_python_learn, Overview of svm_python_classify
Low Level Reference: Objects, Details of User Functions, svmlight Extension Module, Special Parameters
Example Module: multiclass
Put simply, SVMpython is SVMstruct, except that all of the C API functions that the user normally has to implement (except those dealing with C-specific problems) instead call a function of the same name in a Python module. You can write an SVMstruct instance in Python without having to author any C code. SVMpython tries to stay close to SVMstruct in naming conventions and other behavior, but knowledge of SVMstruct's original C implementation is not required.
This document gives a general overview of SVMpython in the first few sections and a more detailed reference in later sections. If you're already familiar with SVMstruct, you can get a pretty good idea of how to use the package merely by browsing through svmstruct.py and multiclass.py. This document provides a more in-depth view of how to use the package.
Note that this is not a conversion of SVMstruct to Python. It is merely an embedding of Python in existing C code. All code other than the user-implemented API functions is still in C, including optimization.
SVMlight is the basic underlying SVM learner. SVMstruct is a general framework, built upon SVMlight, for learning complex output spaces; one writes an instantiation of it to learn in a particular setting. SVMpython extends SVMstruct to allow such instantiations to be written in Python instead of in C. In SVMstruct, the user implements various functions in the svm_struct_api.c file, which the underlying SVMstruct code calls in order to learn a task. The intention of SVMstruct is that the underlying code is constant, and all that a user needs to change is within svm_struct_api.c and svm_struct_api_type.h. SVMpython works the same way, except that the functions to be implemented are written in a Python module (a .py file), the functions in svm_struct_api.c are glue code that calls their embedded Python equivalents from the module, and the types in svm_struct_api_type.h contain Python objects. The intention of SVMpython is that the C code stays constant and the user writes new Python modules, or modifies existing ones, to implement specific tasks.
The primary advantages are that Python code tends to be easier and faster to write than C, less resistant to change and reorganization, and many times more compact; that there's no explicit memory management; and that Python's object orientation means some tedious tasks in SVMstruct can be easily replaced with default built-in behavior.
My favorite example of this last point is that, since Python objects can be assigned any attribute, and since many Python objects are easily serializable with the pickle module, adding a field to the struct-model in Python code consists of a simple assignment like sm.foo = 5 at some point, and that's it. Using SVMstruct, one would add a field to the relevant struct, add an assignment, add code to write it to a model file, add code to parse it from a model file, and then test it to make sure all these little changes work well with each other.
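A minimal sketch of that point, assuming a hypothetical stand-in class StructModel rather than the actual structure-model object that SVMpython hands to user functions; the attribute class_names is likewise only illustrative.

```python
import pickle


class StructModel(object):
    """Hypothetical stand-in for the structure model; not SVMpython's real type."""
    pass


sm = StructModel()
sm.foo = 5                     # adding a "field" is a plain assignment
sm.class_names = ["a", "b"]    # illustrative extra attribute

# Serialize and restore; the new attributes travel along with no extra code.
data = pickle.dumps(sm)
restored = pickle.loads(data)
print(restored.foo, restored.class_names)
```

No writer or parser had to be updated for the new attributes, which is the contrast with the C workflow described above.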
The primary disadvantage of using SVMpython is that it is slower than equivalent C code. For example, considering only the time spent outside of SVM optimization, the Python implementation of multiclass classification takes 9 times as long as SVMmulticlass. However, on this task SVM optimization takes about 99.5% of the time anyway, so the increase is often negligible.
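To see why the increase is small, here is a rough worked calculation. It assumes the 99.5% figure describes the C baseline run, which is one reading of the numbers above, and that the optimization time is identical in both cases because it remains C code.

```python
def overall_slowdown(optimization_fraction, python_factor):
    """Ratio of total Python-run time to total C-run time.

    optimization_fraction: share of the C run spent in SVM optimization,
        assumed unchanged because optimization stays in C.
    python_factor: how many times slower the remaining part is in Python.
    """
    other_fraction = 1.0 - optimization_fraction
    return optimization_fraction + other_fraction * python_factor


ratio = overall_slowdown(0.995, 9)
print("total time ratio: %.3f" % ratio)  # about 1.04 under this reading
```

Under that assumption the whole run is only a few percent slower, even though the non-optimization part is 9 times slower.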
You may download the package from here: svm-python.tar.bz2
To build this, a simple make should do it, unless the Python library you want to use is not the library corresponding to the Python interpreter you get when you just type python.
You might want to edit the Makefile and set the PYTHON variable to the path of the desired interpreter. When you install Python, you install a library and an interpreter. This interpreter is able to output where its corresponding library is stored. The Makefile calls the Python interpreter to get this information, as well as other information relevant to building a C application with embedded Python. You can specify the path of your desired interpreter by setting PYTHON to something other than python.
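The kind of information an interpreter can report about its own library can be sketched as follows. This uses the standard sysconfig module of current Python versions, which is an assumption: the actual Makefile may query the interpreter differently, and on Python 2.3 or 2.4 the comparable functions lived in distutils.sysconfig.

```python
import sys
import sysconfig


def embedding_info():
    """Collect build details an embedding C program typically needs."""
    return {
        "executable": sys.executable,                     # interpreter being queried
        "version": sysconfig.get_python_version(),        # major.minor, e.g. "3.11"
        "include_dir": sysconfig.get_paths()["include"],  # where Python.h lives
        "lib_dir": sysconfig.get_config_var("LIBDIR"),    # may be None on some platforms
    }


if __name__ == "__main__":
    for key, value in sorted(embedding_info().items()):
        print("%s = %s" % (key, value))
```

Running the script with a different interpreter reports that interpreter's paths, which is why pointing PYTHON at another interpreter changes which library the build links against.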
When you build, the program will produce two executables, svm_python_learn for learning a model and svm_python_classify for classification with a learned model.
I have tried building SVMpython with both Python 2.3 and 2.4 on OS X and Linux. Obviously, if the Python module you want to use relies on features specific to Python 2.4 (like generator expressions or the long overdue sorted), you won't be able to use the module with an SVMpython built against the Python 2.3 library.
One annoying detail of embedded Python is that your PYTHONPATH environment variable has to contain "." so the executable knows where to look for the module to load.
The file svmstruct.py is a Python module, and also contains documentation on all the functions which the C code may attempt to call. This is a good place to start reading if you are already familiar with SVMstruct and want to learn how to build an SVMpython Python module. It describes what each function should do and, for non-required functions, the default behavior that applies if you don't implement them. The multiclass.py file is an example implementation of multiclass classification in Python.
Once you've written a Python module in the file foo.py based on svmstruct.py and you want to use SVMpython with this module, you would use the following commands to learn a model and classify with a model, respectively.
./svm_python_learn --m foo [options]
./svm_python_classify --m foo [options]
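If you drive these executables from a script, the PYTHONPATH requirement mentioned earlier can be handled in one place. The helper below is a hypothetical convenience, not part of the package: build_invocation and its parameters are names invented for this sketch, and it only assembles the command and environment without running anything.

```python
import os


def build_invocation(executable, module, options=(), base_env=None):
    """Return (argv, env) for running an SVMpython executable.

    The module is named without its .py suffix, as in the commands above.
    """
    env = dict(os.environ if base_env is None else base_env)
    parts = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    if "." not in parts:
        parts.insert(0, ".")  # let the embedded interpreter find foo.py
    env["PYTHONPATH"] = os.pathsep.join(parts)
    argv = [executable, "--m", module] + list(options)
    return argv, env


argv, env = build_invocation("./svm_python_learn", "foo", base_env={})
print(" ".join(argv))
print("PYTHONPATH=" + env["PYTHONPATH"])
# To actually run it: subprocess.call(argv, env=env)
```

Keeping "." at the front of PYTHONPATH means the module in the current directory is found before any same-named module elsewhere on the path.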