Functional Python. Map, filter, lambda.


By ghostghost | 5901 Reads |

OK, first things first. This may be my first article on Python, but it shouldn't be yours. This article assumes that you have a working knowledge of Python's 'os' module and of defining your own functions.

First, defining functions. Let's start by simply defining a function that prints whatever we pass to it:

    def printf(foo):
        print str(foo)

Next, using other people's functions. Let's import os, then call os.listdir():

    import os
    os.listdir('c:\\')  # the backslash is doubled because a single one starts an escape sequence

os.listdir() is a function that returns a list of everything in a directory.

OK. Everything up to here has been simple. But nothing up to here has been what we'd call functional programming. You see, functional programming is all about stringing functions like these two together. Python, wonderful language that it is, provides us with some really awesome tools for doing this. The first is map.

map(function,list)

The map technique allows us to apply a function to every item in a list. Suppose we wanted to print out the contents of a directory. Remember the function we defined earlier, and the one we used? Instead of a tedious for loop, we can use map to combine them. Just type in:

    map(printf,os.listdir('c:\\somedir\\'))
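The snippets in this article are Python 2. For anyone following along on Python 3, here is a minimal sketch of the same idea. It assumes Python 3, where print is a function and map() hands back a lazy iterator instead of a list, so list() is used to force the calls to happen. The names in lstNames are made up and stand in for whatever os.listdir() would return.

```python
# Python 3 sketch of map(): apply one function to every item in a list.
# lstNames is a hypothetical stand-in for the output of os.listdir().
lstNames = ['holiday.avi', 'notes.txt', 'song.mp3']

def printf(foo):
    # Same job as the article's printf, written for Python 3.
    print(str(foo))

# map() is lazy in Python 3; list() forces printf to run on every item.
# printf returns nothing, so the result is just a list of None values.
lstResults = list(map(printf, lstNames))

# map() is more useful when the function returns something.
lstLengths = list(map(len, lstNames))
print(lstLengths)
```

If all you want is the printing, a plain for loop is the clearer choice on Python 3; map() earns its keep when you want the returned values.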

But what if we wanted to skip the junk and go straight for the .avi files, for example? We could define:

    def isAvi(strFile):
        if strFile[-4:]=='.avi':return True

There. This little function will return True if the last four characters of a string are .avi. (Otherwise it falls off the end and returns None, which counts as false.) This is simple and crude, but it works. And now, it's time to introduce filter().

filter(true or false function,list)

Filter is a lot like map. Like map, it applies a function to every item in a list. However, the list filter returns is made up of items from the original list, and the function used must return either true or false. If it returns true for an item, the item is in the list filter returns. Otherwise, it is not. We can combine filter with os.listdir() and isAvi() for our example.

    filter(isAvi,os.listdir('c:\\somedirwithavifiles'))

will return a list of all the items in a dir that end with '.avi'.
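Here is a small sketch of filter() at work, assuming Python 3 (where filter() is lazy, hence the list() call). The file names are invented for illustration, and isAvi is written to return the comparison directly rather than going through an if.

```python
# Python 3 sketch of filter(): keep only the items the test function accepts.
# lstNames is a hypothetical directory listing.
lstNames = ['holiday.avi', 'notes.txt', 'clip.avi', 'avi_list.doc']

def isAvi(strFile):
    # True only when the last four characters are '.avi'.
    return strFile[-4:] == '.avi'

# filter() returns an iterator in Python 3; list() collects the survivors.
lstAvis = list(filter(isAvi, lstNames))
print(lstAvis)
```

Note that 'avi_list.doc' is dropped: the test looks at the end of the name, not at whether 'avi' appears somewhere in it. The built-in string method strFile.endswith('.avi') does the same check.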

Now, let's suppose that we wanted our code to be a little more flexible. Instead of just AVIs, we want to be able to find anything in a directory. First we could define:

    #does string this contain string that?
    def contains(strThis,strThat):
        if strThat in strThis:return True

Then we could ask each time what you're looking for, and where to look:

    strSearchfor=raw_input('What do you seek?')
    strSearchhere=raw_input('Where shall I search?')

Easy enough, but there's a problem when it comes to applying filter to our new function. Until now, we've only used filter with functions that have one input. Now there are two; how will it know which is which? In truth, filter (and map, when given a single list) will only pass one variable to a function. We need to create a wrapper function to combine strSearchfor with the results from os.listdir(). It's time to introduce lambda.

lambda x: x%2==0

Lambda allows us to define throwaway functions: functions without names, written as a single expression on one line. There is no return keyword; the value of the expression is what the lambda returns. For example, the following bit of code will disagree with you:

    map(lambda y: not y,[True,False,False,True])

and this will return all the even numbers in lstNumbers:

    filter(lambda q: q%2==0,lstNumbers)

In our example, we'll be using it to glue our program together.

    filter(lambda x: contains(x,strSearchfor),os.listdir(strSearchhere))

Filter will pass each item in the os.listdir list to the lambda as x. The lambda passes x, along with what we're searching for, to contains().
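To see the lambda examples side by side, here is a sketch assuming Python 3. The search string and file names are hard-coded stand-ins for the raw_input() and os.listdir() calls, purely so the example runs on its own. It also shows the difference between mapping and filtering with the same lambda.

```python
# Python 3 sketch: lambdas as throwaway one-expression functions.
lstNumbers = [1, 2, 3, 4, 5, 6]
lstEvens = list(filter(lambda q: q % 2 == 0, lstNumbers))

lstBools = [True, False, False, True]
# map() negates every item; filter() keeps the items whose negation is true.
lstNegated = list(map(lambda y: not y, lstBools))
lstKept = list(filter(lambda y: not y, lstBools))

def contains(strThis, strThat):
    # Does strThis contain strThat?
    return strThat in strThis

# Hypothetical stand-ins for raw_input() and os.listdir().
strSearchfor = 'avi'
lstNames = ['holiday.avi', 'notes.txt', 'avi_list.doc']

# The lambda is the wrapper: filter supplies x, the lambda supplies strSearchfor.
lstHits = list(filter(lambda x: contains(x, strSearchfor), lstNames))
print(lstEvens, lstNegated, lstKept, lstHits)
```

The argument order matters: contains(x, strSearchfor) asks whether the file name contains the search string, not the other way around.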

OK, at this point I want to make sure you're familiar with os.path.join, os.path.isfile, and os.path.isdir. Join will combine two items into a path, which you can use in other functions. isfile and isdir test whether an item exists and whether it's a file or a directory, respectively.
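If those three are new to you, this sketch shows what they return. It assumes Python 3 and builds a throwaway directory with the standard tempfile module so nothing on your disk is touched; the file and directory names are made up.

```python
import os
import tempfile

# Build a scratch directory holding one file and one subdirectory.
with tempfile.TemporaryDirectory() as strRoot:
    # os.path.join glues path pieces together with the right separator.
    strFile = os.path.join(strRoot, 'example.txt')
    strDir = os.path.join(strRoot, 'subdir')
    open(strFile, 'w').close()   # create an empty file
    os.mkdir(strDir)             # create an empty directory

    bFileIsFile = os.path.isfile(strFile)   # exists and is a file
    bFileIsDir = os.path.isdir(strFile)     # a file is not a directory
    bDirIsDir = os.path.isdir(strDir)       # exists and is a directory
    # Something that does not exist is neither a file nor a directory.
    bMissingIsFile = os.path.isfile(os.path.join(strRoot, 'nope.txt'))

print(bFileIsFile, bFileIsDir, bDirIsDir, bMissingIsFile)
```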

Recursive(strFoo,intCount=4)
    Recursive(strFoo,intCount-1)

Recursion is a concept in programming, and some of you might be familiar with it. It's not specific to any language; however, certain languages choose not to allow it. A recursive function is one that calls itself. For example, if I wanted to count down from 5 to 1, I could define the following function

    def RecursiveCount(intCount):
        if intCount<=0:return
        print intCount
        RecursiveCount(intCount-1)

and pass it the number five. It will print 5, and call itself with 5-1. Its called self will print 4, and call itself with 4-1. Its called self… well, you get the idea. Recursive functions can be really tricky to work with. At the same time, they can be an incredibly powerful way of thinking. Any problem that involves breaking the same problem down again and again is likely well suited for a recursive solution.
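Here is the countdown as a runnable sketch, assuming Python 3. As an addition for illustration, this variant also returns the numbers it visited, which makes the base case and the recursive case easier to see and to check.

```python
def RecursiveCount(intCount):
    # Base case: nothing left to count, so stop recursing.
    if intCount <= 0:
        return []
    print(intCount)
    # Recursive case: this number, followed by whatever the smaller call visits.
    return [intCount] + RecursiveCount(intCount - 1)

lstCounted = RecursiveCount(5)
```

Without the base case the function would keep calling itself until Python gives up with a RecursionError.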

Bringing it all together.

OK, now we have all the pieces. You remember in the intro, I told you we'd be writing a program that searches an entire filesystem in under 10 lines of code. Truth be told, we already have most of the code too. We just need to glue it all together into one of the most compact and amazing pieces of code ever. It's also amazingly unreadable and very uber geek, so I'm going to break it down for you line by line. Each numbered line below is a single line of code, however long it looks. All code lines here start with ':'

Line 1. If you don't understand line 1, why are you this far into the article? :import os

Line 2 Nothing special :strSearchFor=raw_input('Filenames containing what string?')

Line 3 is a bit more, but nothing new. :def contains(strThis,strThat):

Line 4, again, nothing special, except that contains() now prints the match instead of returning True. :[tab]if strThat in strThis:print strThis

Line 5 Here we define our main function, which takes three variables. strSearchFor is the string we're searching for in filenames. strRoot is the directory we're looking in, or where we'll start searching. intCount is the depth to which we will search the filesystem. :def RecursiveDir(strSearchFor,strRoot,intCount=5):

Line 6 If we should look 0 directories deeper, stop. :[tab]if intCount<=0:return

Line 7 OK. Here we are, the down and dirty part. This line is absurdly long, but if we break it down, it'll make sense. :[tab]map(lambda q: contains(os.path.join(strRoot,q),strSearchFor),filter(lambda x: os.path.isfile(os.path.join(strRoot,x)),os.listdir(strRoot)))

Line 7 breakdown

map(FOO,BAR)

This line calls map to apply function FOO to list BAR. Simple; now let's break down FOO and BAR.

BAR is filter(lambda x: os.path.isfile(os.path.join(strRoot,x)),os.listdir(strRoot))

Alright, we know that filter() filters a list based on the truth of a function. The list in this case is simple: os.listdir(strRoot) provides a list of everything in the directory. As for the function here, we use lambda to take each item from that list (x), use os.path.join to combine it into a full path with strRoot, and then check if it's a file using os.path.isfile(). If isfile returns true, x stays in the list that is passed on to FOO.

FOO is lambda q: contains(os.path.join(strRoot,q),strSearchFor)

The map passes every item in BAR on to FOO, which uses a lambda to combine it (q) with strRoot; the resulting path is passed along with strSearchFor to the contains() function we defined earlier. OK, in this one line, we search through every file in the root directory (strRoot) and check whether its path contains what we're looking for. But what if the file is inside another directory? That's where recursive ideas, os.path.isdir, and Line 8 come into play.
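If the nesting is hard to follow, it can help to see the same work written two ways. This is a sketch, assuming Python 3: the function names FindWithMap and FindWithLoop are made up for illustration, both collect matches into a list instead of printing them, and a scratch directory from tempfile stands in for strRoot.

```python
import os
import tempfile

def FindWithMap(strRoot, strSearchFor):
    # BAR: the entries of strRoot that are regular files.
    lstFiles = filter(lambda x: os.path.isfile(os.path.join(strRoot, x)),
                      os.listdir(strRoot))
    # FOO: join each name onto strRoot, then keep paths containing the string.
    lstPaths = map(lambda q: os.path.join(strRoot, q), lstFiles)
    return sorted(filter(lambda p: strSearchFor in p, lstPaths))

def FindWithLoop(strRoot, strSearchFor):
    # The "tedious for loop" doing exactly the same job.
    lstHits = []
    for strName in os.listdir(strRoot):
        strPath = os.path.join(strRoot, strName)
        if os.path.isfile(strPath) and strSearchFor in strPath:
            lstHits.append(strPath)
    return sorted(lstHits)

with tempfile.TemporaryDirectory() as strTmp:
    for strName in ('a.avi', 'b.txt', 'c.avi'):
        open(os.path.join(strTmp, strName), 'w').close()
    os.mkdir(os.path.join(strTmp, 'd.avi'))  # a directory, so isfile skips it
    lstMapHits = [os.path.basename(p) for p in FindWithMap(strTmp, '.avi')]
    lstLoopHits = [os.path.basename(p) for p in FindWithLoop(strTmp, '.avi')]

print(lstMapHits)
```

Both versions test the full path, as the article's line 7 does, so a search string that happens to appear in strRoot itself would match every file in it.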

Line 8 Again, I'm gonna break this down into the two components of the main map(FUNC,LIST)
:[tab]map(lambda z: RecursiveDir(strSearchFor,os.path.join(strRoot,z),intCount-1),filter(lambda y: os.path.isdir(os.path.join(strRoot,y)),os.listdir(strRoot)))

The LIST in this case is filter(lambda y: os.path.isdir(os.path.join(strRoot,y)),os.listdir(strRoot))

Here we're using os.listdir to provide the base list again, but we're filtering it based on whether or not the items in it are directories.

And the FUNCtion is lambda z: RecursiveDir(strSearchFor,os.path.join(strRoot,z),intCount-1)

So, at this point, we have a list of directories. Probably want to search through them, right? What we can do is actually call another instance of RecursiveDir for each subdirectory. We use lambda to pass the directory (z) on to os.path.join, which combines it with strRoot (the directory we're looking through) and passes that along to RecursiveDir together with strSearchFor. It also passes another variable along, intCount-1. This is to let the new RecursiveDir know how much deeper it should look.

Line 9 We're done. Just call the new function :RecursiveDir(strSearchFor,'c:\\',5) This will look 5 directory levels deep, counting c:\ itself as the first, for filenames containing strSearchFor

Now, here's the source code for the entire application, only 9 lines long.

    import os
    strSearchFor=raw_input('Filenames containing what string?')
    def contains(strThis,strThat):
        if strThat in strThis: print strThis
    def RecursiveDir(strSearchFor,strRoot,intCount=5):
        if intCount<=0:return
        map(lambda q: contains(os.path.join(strRoot,q),strSearchFor),filter(lambda x: os.path.isfile(os.path.join(strRoot,x)),os.listdir(strRoot)))
        map(lambda z: RecursiveDir(strSearchFor,os.path.join(strRoot,z),intCount-1),filter(lambda y: os.path.isdir(os.path.join(strRoot,y)),os.listdir(strRoot)))
    RecursiveDir(strSearchFor,'c:\\',5)
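The listing above is Python 2: it relies on the print statement, raw_input(), and map() running eagerly. On Python 3, map() is lazy, so those two map lines would never call anything. What follows is one possible Python 3 sketch of the same search, not a line-for-line port: it returns the matches as a list instead of printing from inside contains(), uses list comprehensions for the file step, and is exercised against a scratch directory tree whose names are invented for the example.

```python
import os
import tempfile

def RecursiveDir(strSearchFor, strRoot, intCount=5):
    # Stop once the depth budget is used up.
    if intCount <= 0:
        return []
    lstPaths = [os.path.join(strRoot, n) for n in sorted(os.listdir(strRoot))]
    # Files at this level whose full path contains the search string.
    lstHits = [p for p in lstPaths if os.path.isfile(p) and strSearchFor in p]
    # Recurse into each subdirectory with one less level to go.
    for strDir in filter(os.path.isdir, lstPaths):
        lstHits.extend(RecursiveDir(strSearchFor, strDir, intCount - 1))
    return lstHits

with tempfile.TemporaryDirectory() as strTmp:
    # Scratch tree: top.avi, skip.txt, sub/deep.avi, sub/subsub/deeper.avi
    os.makedirs(os.path.join(strTmp, 'sub', 'subsub'))
    for tplParts in (('top.avi',), ('skip.txt',), ('sub', 'deep.avi'),
                     ('sub', 'subsub', 'deeper.avi')):
        open(os.path.join(strTmp, *tplParts), 'w').close()
    lstAll = [os.path.basename(p) for p in RecursiveDir('.avi', strTmp)]
    lstShallow = [os.path.basename(p) for p in RecursiveDir('.avi', strTmp, 2)]

print(lstAll)
print(lstShallow)
```

With intCount set to 2, the search covers the starting directory and one level below it, so deeper.avi is not reached. On a real filesystem, os.listdir() can raise an error for directories you aren't allowed to read; this sketch does not handle that.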

Thanks for reading along, and I hope you learned something about functional programming. I'll probably continue with intermediate-level Python tutorials for a while, helping people move past Python as a scripting language and into Python as a means of quickly developing full-fledged applications. Next up, maybe something on threading.

Digital Chameleon

Comments
ghost 16 years ago

Good article. You missed an apostrophe in the code below. Nothing to fret about. :)

[tab]if strFile[:-4] =='.avi:return True