Python: How is the built-in method str.strip() implemented?

str.strip()

Return a copy of the string with the leading and trailing whitespace removed.  # "   hello   " => "hello"
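
A quick illustration of that behavior, together with the one-sided variants lstrip() and rstrip(). This is only a minimal sketch: the sample string is arbitrary, and the print() calls assume Python 3 syntax.

```python
text = "   hello   "

# repr() makes any surviving spaces visible in the output.
print(repr(text.strip()))    # 'hello'
print(repr(text.lstrip()))   # 'hello   '
print(repr(text.rstrip()))   # '   hello'

# Whitespace inside the string is never touched.
inner = " a b "
print(repr(inner.strip()))   # 'a b'
```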


Most of Python's built-in types are written in C. Their source lives in the Objects directory of the Python source tree. [1]

In Python 2, the stringobject.c file contains the implementations of the string methods. Its string_strip function is the entry point for the strip method and shows which underlying functions do the work. [2]

The string_strip function [3] calls the do_strip function, which does all the real work when strip is called without arguments. Here's the do_strip function:

Py_LOCAL_INLINE(PyObject *)
do_strip(PyStringObject *self, int striptype)
{
    /* returns a NUL-terminated representation of the contents of string: */
    char *s = PyString_AS_STRING(self);
    /* returns the length of the string: */
    Py_ssize_t len = PyString_GET_SIZE(self), i, j;

    i = 0;
    if (striptype != RIGHTSTRIP) {
        while (i < len && isspace(Py_CHARMASK(s[i]))) {
            i++;
        }
    }

    j = len;
    if (striptype != LEFTSTRIP) {
        do {
            j--;
        } while (j >= i && isspace(Py_CHARMASK(s[j])));
        j++;
    }

    if (i == 0 && j == len && PyString_CheckExact(self)) {
        Py_INCREF(self);
        return (PyObject*)self;
    }
    else
        /* returns a new string object with the value arg1 and length arg2: */
        return PyString_FromStringAndSize(s+i, j-i);
}

The C API documentation for string objects helped me figure out many of the special functions and macros in do_strip. To understand it better, I translated it into a working Python example.

The above code, in Python:

import string  # Only used in isspace(), to get the whitespace characters.

def isspace(ch):
    return ch in string.whitespace

def do_strip(s, striptype="both"):
    length = len(s)

    # Skip leading whitespace unless only the right side is being stripped.
    i = 0
    if striptype != "right":
        while i < length and isspace(s[i]):
            i += 1

    # Skip trailing whitespace unless only the left side is being stripped.
    j = length
    if striptype != "left":
        j -= 1
        while j >= i and isspace(s[j]):
            j -= 1
        j += 1

    # Nothing to strip: return the original object, as the C code does.
    if i == 0 and j == length and type(s) is str:
        return s
    else:
        return s[i:j]

print(do_strip("   hi   "))  # prints "hi"

And then there's the much more elegant strip() function from the string module in Python 1.5.2 (whitespace is a constant defined in that module):

def strip(s):
    i, j = 0, len(s)
    while i < j and s[i] in whitespace: i = i+1
    while i < j and s[j-1] in whitespace: j = j-1
    return s[i:j]
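
To check that the 1.5.2 version still agrees with the built-in, here is a minimal, self-contained sketch. It assumes Python 3 and imports whitespace from the string module, since the listing relies on that name being available. The function name strip_152 is hypothetical, chosen only for this comparison, and the sample strings are my own. They contain only ASCII whitespace, because str.strip() with no arguments also removes Unicode whitespace that string.whitespace does not list.

```python
from string import whitespace  # the ASCII whitespace characters

def strip_152(s):
    """Same two-index scan as the Python 1.5.2 string.strip() listing."""
    i, j = 0, len(s)
    # Move i right past leading whitespace.
    while i < j and s[i] in whitespace:
        i += 1
    # Move j left past trailing whitespace.
    while i < j and s[j - 1] in whitespace:
        j -= 1
    return s[i:j]

samples = ["   hi   ", "hi", "", "   ", "\t a b \n"]
for sample in samples:
    assert strip_152(sample) == sample.strip()
print("all samples match")
```

Both loops share the single condition i < j, which is why this version needs no special case for empty or all-whitespace strings.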

[1] You can download Python's source files here: http://www.python.org/download/

[2] Paul Boddie told me this in the Asking For Help section of the Python Wiki.

[3]

static PyObject *
string_strip(PyStringObject *self, PyObject *args)
{
    if (PyTuple_GET_SIZE(args) == 0)
        return do_strip(self, BOTHSTRIP); /* Common case */
    else
        return do_argstrip(self, BOTHSTRIP, args);
}
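
The else branch handles calls that pass an argument, such as strip(chars). Seen from the Python side, that looks roughly like the sketch below. The sample strings are my own, and the print() calls assume Python 3 syntax.

```python
# With an argument, strip() removes any combination of the given
# characters from both ends, not the argument as a whole substring.
padded = "xxhixx"
print(padded.strip("x"))        # hi

mixed = "xyxhixyy"
print(mixed.strip("xy"))        # hi

# Characters in the middle of the string are left alone.
middle = "xxhxixx"
print(middle.strip("x"))        # hxi
```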
 
