API

Transformation
Filtering
Inspection

Transformation

fntools.use_with(data, fn, *attrs)

Apply a function on the attributes of the data

Parameters:	data – an object
		fn – a function
		attrs – some attributes of the object
Returns:	an object

Let’s create some data first:

>>> from collections import namedtuple
>>> Person = namedtuple('Person', ('name', 'age', 'gender'))
>>> alice = Person('Alice', 30, 'F')

Usage:

>>> make_csv_row = lambda n, a, g: '%s,%d,%s' % (n, a, g)
>>> use_with(alice, make_csv_row, 'name', 'age', 'gender')
'Alice,30,F'
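
The behaviour shown above can be approximated in a few lines. This is only a sketch for Python 3, assuming that attributes are looked up with getattr and passed to fn positionally; the name use_with_sketch is hypothetical and this is not the library's own code.

```python
from collections import namedtuple


def use_with_sketch(data, fn, *attrs):
    """Call fn with the values of the named attributes of data (illustrative only)."""
    # getattr raises AttributeError if data lacks one of the attributes
    args = [getattr(data, name) for name in attrs]
    return fn(*args)


Person = namedtuple('Person', ('name', 'age', 'gender'))
alice = Person('Alice', 30, 'F')
row = use_with_sketch(alice, lambda n, a, g: '%s,%d,%s' % (n, a, g),
                      'name', 'age', 'gender')
print(row)  # Alice,30,F
```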

fntools.zip_with(fn, *colls)

Return the result of the function applied on the zip of the collections

Parameters:	fn – a function
		colls – collections
Returns:	an iterator

>>> list(zip_with(lambda x, y: x-y, [10, 20, 30], [42, 19, 43]))
[-32, 1, -13]

fntools.unzip(colls)

Unzip collections

Parameters:	colls – collections
Returns:	unzipped collections

>>> unzip([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
[(1, 10, 100), (2, 20, 200), (3, 30, 300)]

fntools.concat(colls)

Concatenate a list of collections

Parameters:	colls – a list of collections
Returns:	the concatenation of the collections

>>> concat(([1, 2], [3, 4]))
[1, 2, 3, 4]

fntools.mapcat(fn, colls)

Concatenate the result of a map

Parameters:	fn – a function
		colls – a list of collections
Returns:	a list

>>> mapcat(reversed,  [[3, 2, 1, 0], [6, 5, 4], [9, 8, 7]])
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

fntools.dmap(fn, record)

map for a dictionary: apply fn to each value

Parameters:	fn – a function
		record – a dictionary
Returns:	a dictionary

>>> grades = [{'math': 13, 'biology': 17, 'chemistry': 18},
... {'math': 15, 'biology': 12, 'chemistry': 13},
... {'math': 16, 'biology': 17, 'chemistry': 11}]

>>> def is_greater_than(x):
...     def func(y):
...         return y > x
...     return func

>>> dmap(is_greater_than(15), grades[0])
{'biology': True, 'chemistry': True, 'math': False}

fntools.rmap(fn, coll, is_iterable=None)

A recursive map

Parameters:	fn – a function
		coll – a list
		is_iterable – a predicate function determining whether a value is iterable
Returns:	a list

>>> rmap(lambda x: 2*x, [1, 2, [3, 4]])
[2, 4, [6, 8]]
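
To make the role of is_iterable concrete, here is a hedged Python 3 sketch of a recursive map. The name rmap_sketch and the default predicate (lists and tuples count as nested collections) are assumptions made for illustration; the library's actual default may differ.

```python
def rmap_sketch(fn, coll, is_iterable=None):
    """Apply fn to every leaf of a nested collection, keeping the nesting."""
    if is_iterable is None:
        # Assumed default: only lists and tuples are descended into
        is_iterable = lambda x: isinstance(x, (list, tuple))
    return [rmap_sketch(fn, x, is_iterable) if is_iterable(x) else fn(x)
            for x in coll]


doubled = rmap_sketch(lambda x: 2 * x, [1, 2, [3, 4]])
print(doubled)  # [2, 4, [6, 8]]

# With a custom predicate, tuples are treated as leaves and handed to fn
lengths = rmap_sketch(len, [(1, 2), [(3,)]],
                      is_iterable=lambda x: isinstance(x, list))
print(lengths)  # [2, [1]]
```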

fntools.replace(x, old, new, fn=operator.eq)

Replace x with new if fn(x, old) is True.

Parameters:	x – any value
		old – the old value we want to replace
		new – the value replacing old
		fn – the predicate function determining the relation between x and old. By default fn is the equality function.
Returns:	x or new

>>> map(lambda x: replace(x, None, -1), [None, 1, 2, None])
[-1, 1, 2, -1]

fntools.compose(*fns)

Return the function composed with the given functions

Parameters:	fns – functions
Returns:	a function

>>> add2 = lambda x: x+2
>>> mult3 = lambda x: x*3
>>> new_fn = compose(add2, mult3)
>>> new_fn(2)
8

Note

compose(fn1, fn2, fn3)(x) is the same as fn1(fn2(fn3(x))), which means that the last function provided is the first to be applied.
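
The right-to-left order can be checked with a small Python 3 sketch built on functools.reduce. The name compose_sketch is hypothetical and the code assumes every function takes a single argument; it illustrates the ordering rather than reproducing the library's code.

```python
from functools import reduce


def compose_sketch(*fns):
    """Return a function applying fns from right to left."""
    def composed(x):
        # reversed(fns): the last function given is applied first
        return reduce(lambda acc, fn: fn(acc), reversed(fns), x)
    return composed


add2 = lambda x: x + 2
mult3 = lambda x: x * 3

print(compose_sketch(add2, mult3)(2))  # 8, i.e. add2(mult3(2))
print(compose_sketch(mult3, add2)(2))  # 12, i.e. mult3(add2(2))
```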

fntools.groupby(fn, coll)

Group elements in sub-collections by fn

Parameters:	fn – a function
		coll – a collection
Returns:	a dictionary

>>> groupby(len, ['John', 'Terry', 'Eric', 'Graham', 'Mickael'])
{4: ['John', 'Eric'], 5: ['Terry'], 6: ['Graham'], 7: ['Mickael']}
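
Note that the example groups 'John' and 'Eric' together even though they are not adjacent, unlike itertools.groupby, which only groups consecutive items. A minimal Python 3 sketch of that dictionary-building behaviour, with the hypothetical name groupby_sketch:

```python
from collections import defaultdict


def groupby_sketch(fn, coll):
    """Collect the items of coll in lists keyed by fn(item)."""
    groups = defaultdict(list)
    for item in coll:
        groups[fn(item)].append(item)
    return dict(groups)


names = ['John', 'Terry', 'Eric', 'Graham', 'Mickael']
by_length = groupby_sketch(len, names)
print(by_length)
# {4: ['John', 'Eric'], 5: ['Terry'], 6: ['Graham'], 7: ['Mickael']}
```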

fntools.reductions(fn, seq, acc=None)

Return the intermediate values of a reduction

Parameters:	fn – a function
		seq – a sequence
		acc – the accumulator
Returns:	a list

>>> reductions(lambda x, y: x + y, [1, 2, 3])
[1, 3, 6]

>>> reductions(lambda x, y: x + y, [1, 2, 3], 10)
[11, 13, 16]
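
In the second example the initial accumulator 10 does not appear in the result. The Python 3 sketch below reproduces both outputs with itertools.accumulate (the initial keyword requires Python 3.8 or later). The name reductions_sketch is hypothetical, and dropping the initial value is an assumption inferred from the outputs shown above.

```python
from itertools import accumulate


def reductions_sketch(fn, seq, acc=None):
    """Return the running results of reducing seq with fn."""
    if acc is None:
        return list(accumulate(seq, fn))
    # accumulate(..., initial=acc) yields acc first; drop it to match
    # the documented output, which starts at fn(acc, first_item)
    return list(accumulate(seq, fn, initial=acc))[1:]


add = lambda x, y: x + y
print(reductions_sketch(add, [1, 2, 3]))      # [1, 3, 6]
print(reductions_sketch(add, [1, 2, 3], 10))  # [11, 13, 16]
```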

fntools.split(coll, factor)

Split a collection by using a factor

Parameters:	coll – a collection
		factor – a collection of factors
Returns:	a dictionary

>>> bands = ('Led Zeppelin', 'Debussy', 'Metallica', 'Iron Maiden', 'Bach')
>>> styles = ('rock', 'classic', 'rock', 'rock', 'classic')
>>> split(bands, styles)
{'classic': ['Debussy', 'Bach'], 'rock': ['Led Zeppelin', 'Metallica', 'Iron Maiden']}

fntools.assoc(_d, key, value)

Associate a key with a value in a dictionary

Parameters:	_d – a dictionary
		key – a key in the dictionary
		value – a value for the key
Returns:	a new dictionary

>>> data = {}
>>> new_data = assoc(data, 'name', 'Holy Grail')
>>> new_data
{'name': 'Holy Grail'}
>>> data
{}

Note

The original dictionary is not modified.
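
One way to obtain this non-mutating behaviour is to copy before assigning. The Python 3 sketch below uses the hypothetical name assoc_sketch and assumes a shallow copy is sufficient; nested values would still be shared with the original.

```python
def assoc_sketch(_d, key, value):
    """Return a copy of _d with key set to value; _d itself is untouched."""
    new_d = dict(_d)  # shallow copy: nested objects are shared
    new_d[key] = value
    return new_d


data = {}
new_data = assoc_sketch(data, 'name', 'Holy Grail')
print(new_data)  # {'name': 'Holy Grail'}
print(data)      # {}
```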

fntools.dispatch(data, fns)

Apply the functions on the data

Parameters:	data – the data
		fns – a list of functions
Returns:	a collection

>>> x = (1, 42, 5, 79)
>>> dispatch(x, (min, max))
[1, 79]

fntools.multimap(fn, colls)

Apply a function on multiple collections

Parameters:	fn – a function
		colls – collections
Returns:	a collection

>>> import operator
>>> multimap(operator.add, ((1, 2, 3), (4, 5, 6)))
[5, 7, 9]

>>> f = lambda x, y, z: 2*x + 3*y - z
>>> result = multimap(f, ((1, 2), (4, 1), (1, 1)))
>>> result[0] == f(1, 4, 1)
True
>>> result[1] == f(2, 1, 1)
True

fntools.multistarmap(fn, *colls)

Apply a function on multiple collections passed as separate arguments

Parameters:	fn – a function
		colls – collections
Returns:	a collection

>>> import operator
>>> multistarmap(operator.add, (1, 2, 3), (4, 5, 6))
[5, 7, 9]

>>> f = lambda x, y, z: 2*x + 3*y - z
>>> result = multistarmap(f, (1, 2), (4, 1), (1, 1))
>>> result[0] == f(1, 4, 1)
True
>>> result[1] == f(2, 1, 1)
True

fntools.pipe(data, *fns)

Apply functions successively on your data, feeding the result of each function to the next one

Parameters:	data – the data
		fns – functions
Returns:	an object

>>> inc = lambda x: x + 1
>>> pipe(42, inc, str)
'43'
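
The example above amounts to str(inc(42)): functions are applied from left to right. A minimal Python 3 sketch of that idea with functools.reduce follows; the name pipe_sketch is hypothetical and this is not necessarily how the library implements it.

```python
from functools import reduce


def pipe_sketch(data, *fns):
    """Thread data through fns from left to right."""
    return reduce(lambda acc, fn: fn(acc), fns, data)


inc = lambda x: x + 1
result = pipe_sketch(42, inc, str)
print(repr(result))  # '43'
```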

fntools.pipe_each(coll, *fns)

Apply functions successively on each item of your collection of data

Parameters:	coll – a collection
		fns – functions
Returns:	a list

>>> inc = lambda x: x + 1
>>> pipe_each([0, 1, 1, 2, 3, 5], inc, str)
['1', '2', '2', '3', '4', '6']

fntools.shift(func, *args, **kwargs)

This function is basically a beefed up lambda x: func(x, *args, **kwargs)

shift() comes in handy when it is used in a pipeline with a function that needs the passed value as its first argument.

Parameters:	func – a function
		args – objects
		kwargs – keywords
Returns:	a function

>>> def div(x, y): return float(x) / y

This is equivalent to div(42, 2):

>>> shift(div, 2)(42)
21.0

which is different from div(2, 42):

>>> from functools import partial
>>> partial(div, 2)(42)
0.047619047619047616
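
The contrast with functools.partial can be made explicit with a closure that inserts the incoming value in first position. This is a Python 3 sketch under the description given above; the name shift_sketch is hypothetical.

```python
from functools import partial


def shift_sketch(func, *args, **kwargs):
    """Return a one-argument function that calls func(x, *args, **kwargs)."""
    def shifted(x):
        # x goes first; the stored arguments follow it
        return func(x, *args, **kwargs)
    return shifted


def div(x, y):
    return float(x) / y


halve = shift_sketch(div, 2)   # x -> div(x, 2)
two_over = partial(div, 2)     # y -> div(2, y)
print(halve(42))  # 21.0
```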

fntools.repeatedly(func)

Call a function taking no argument repeatedly

Parameters:	func – a function
Returns:	a generator

>>> import random as rd
>>> rd.seed(123)
>>> take(3, repeatedly(rd.random))
[0.052363598850944326, 0.08718667752263232, 0.4072417636703983]
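
Because the generator never ends on its own, it has to be consumed through something that stops, such as take. The Python 3 sketch below shows the idea with a deterministic function and itertools.islice standing in for take; the name repeatedly_sketch is hypothetical.

```python
from itertools import count, islice


def repeatedly_sketch(func):
    """Yield func() forever; the caller decides when to stop."""
    while True:
        yield func()


ticks = count()  # 0, 1, 2, ...
first_three = list(islice(repeatedly_sketch(lambda: next(ticks)), 3))
print(first_three)  # [0, 1, 2]
```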

fntools.update(records, column, values)

Update the column of records

Parameters:	records – a list of dictionaries
		column – a string
		values – an iterable or a function
Returns:	new records with the column updated

>>> movies = [
... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6},
... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6},
... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6}
... ]
>>> new_movies = update(movies, 'budget', lambda x: 2*x)
>>> [new_movies[i]['budget'] for i,_ in enumerate(movies)]
[800000.0, 8000000.0, 18000000.0]

>>> new_movies2 = update(movies, 'budget', (40, 400, 900))
>>> [new_movies2[i]['budget'] for i,_ in enumerate(movies)]
[40, 400, 900]

fntools.use(data, attrs)

Return the values of the attributes for the given data

Parameters:	data – the data
		attrs – strings
Returns:	a list

With a dict:

>>> band = {'name': 'Metallica', 'singer': 'James Hetfield', 'guitarist': 'Kirk Hammet'}
>>> use(band, ('name', 'date', 'singer'))
['Metallica', None, 'James Hetfield']

With a non-dict data structure:

>>> from collections import namedtuple
>>> Person = namedtuple('Person', ('name', 'age', 'gender'))
>>> alice = Person('Alice', 30, 'F')
>>> use(alice, ('name', 'gender'))
['Alice', 'F']

fntools.get_in(record, *keys, **kwargs)

Return the value corresponding to the keys in a nested record

Parameters:	record – a dictionary
		keys – strings
		kwargs – keywords, such as default (the value returned when a key is missing)
Returns:	the value for the keys

>>> d = {'id': {'name': 'Lancelot', 'actor': 'John Cleese', 'color': 'blue'}}
>>> get_in(d, 'id', 'name')
'Lancelot'

>>> get_in(d, 'id', 'age', default='?')
'?'
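
The lookup can be pictured as walking down the nested dictionaries one key at a time. The Python 3 sketch below uses the hypothetical name get_in_sketch and assumes that only the default keyword is recognised and that None is returned when no default is given.

```python
def get_in_sketch(record, *keys, **kwargs):
    """Follow keys through nested dicts; return default if a key is missing."""
    default = kwargs.get('default')  # assumed to fall back to None
    current = record
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


d = {'id': {'name': 'Lancelot', 'actor': 'John Cleese', 'color': 'blue'}}
print(get_in_sketch(d, 'id', 'name'))                # Lancelot
print(get_in_sketch(d, 'id', 'age', default='?'))    # ?
```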

fntools.valueof(records, key)

Extract the value corresponding to the given key in all the dictionaries

>>> bands = [{'name': 'Led Zeppelin', 'singer': 'Robert Plant', 'guitarist': 'Jimmy Page'},
... {'name': 'Metallica', 'singer': 'James Hetfield', 'guitarist': 'Kirk Hammet'}]
>>> valueof(bands, 'singer')
['Robert Plant', 'James Hetfield']

Filtering

fntools.duplicates(coll)

Return the duplicated items in the given collection

Parameters:	coll – a collection
Returns:	a list of the duplicated items in the collection

>>> duplicates([1, 1, 2, 3, 3, 4, 1, 1])
[1, 3]
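
In the example each duplicated item is reported once, however many times it occurs. A Python 3 sketch with collections.Counter that gives the same result is shown below; the name duplicates_sketch is hypothetical, and reporting items in order of first appearance is an assumption.

```python
from collections import Counter


def duplicates_sketch(coll):
    """Return each item that occurs more than once, in first-seen order."""
    counts = Counter(coll)  # keeps insertion order on Python 3.7+
    return [item for item, n in counts.items() if n > 1]


dups = duplicates_sketch([1, 1, 2, 3, 3, 4, 1, 1])
print(dups)  # [1, 3]
```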

fntools.pluck(record, *keys, **kwargs)

Return the record with the selected keys

Parameters:	record – a dictionary
		keys – some keys from the record
		kwargs – keywords determining how to deal with the keys

>>> d = {'name': 'Lancelot', 'actor': 'John Cleese', 'color': 'blue'}
>>> pluck(d, 'name', 'color')
{'color': 'blue', 'name': 'Lancelot'}

The keyword 'default' sets the value used, instead of None, for keys that are missing from the record:

>>> d = {'year': 2014, 'movie': 'Bilbo'}
>>> pluck(d, 'year', 'movie', 'nb_aliens', default=0)
{'movie': 'Bilbo', 'nb_aliens': 0, 'year': 2014}

fntools.pluck_each(records, columns)

Return the records with the selected columns

Parameters:	records – a list of dictionaries
		columns – a list or a tuple
Returns:	a list of dictionaries with the selected columns

>>> movies = [
... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6},
... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6},
... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6}
... ]
>>> pluck_each(movies, ('title', 'year'))
[{'year': 1975, 'title': 'The Holy Grail'}, {'year': 1979, 'title': 'Life of Brian'}, {'year': 1983, 'title': 'The Meaning of Life'}]

fntools.take(n, seq)

Return the n first items in the sequence

Parameters:	n – an integer
		seq – a sequence
Returns:	a list

>>> take(3, xrange(10000))
[0, 1, 2]

fntools.drop(n, seq)

Drop the n first items of the sequence and return the remaining ones

Parameters:	n – an integer
		seq – a sequence
Returns:	a list

>>> drop(9997, xrange(10000))
[9997, 9998, 9999]
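
take and drop are complementary: one keeps the first n items, the other skips them. A Python 3 sketch of both with itertools.islice follows (range replaces the Python 2 xrange used in the examples). The names take_sketch and drop_sketch are hypothetical.

```python
from itertools import islice


def take_sketch(n, seq):
    """Return the first n items of seq as a list."""
    return list(islice(seq, n))


def drop_sketch(n, seq):
    """Skip the first n items of seq and return the rest as a list."""
    return list(islice(seq, n, None))


print(take_sketch(3, range(10000)))     # [0, 1, 2]
print(drop_sketch(9997, range(10000)))  # [9997, 9998, 9999]
```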

fntools.find(fn, record)

Apply a function on the values of the record and return the entry whose value is the result

Parameters:	fn – a function
		record – a dictionary
Returns:	a dictionary

>>> find(max, {'Terry': 30, 'Graham': 35, 'John': 27})
{'Graham': 35}

fntools.find_each(fn, records)

Apply a function on the values of each record and return, for each record, the entry whose value is the result

Parameters:	fn – a function
		records – a collection of dictionaries
Returns:	new records

>>> grades = [{'math': 13, 'biology': 17, 'chemistry': 18},
... {'math': 15, 'biology': 12, 'chemistry': 13},
... {'math': 16, 'biology': 17, 'chemistry': 11}]
>>> find_each(max, grades)
[{'chemistry': 18}, {'math': 15}, {'biology': 17}]

fntools.dfilter(fn, record)

filter for a dictionary: keep the entries whose value satisfies fn

Parameters:	fn – a predicate function
		record – a dict
Returns:	a dict

>>> odd = lambda x: x % 2 != 0
>>> dfilter(odd, {'Terry': 30, 'Graham': 35, 'John': 27})
{'John': 27, 'Graham': 35}

fntools.remove(coll, value)

Remove all the occurrences of a given value

Parameters:	coll – a collection
		value – the value to remove
Returns:	the collection without the value (a tuple in the example below)

>>> data = ('NA', 0, 1, 'NA', 1, 2, 3, 'NA', 5)
>>> remove(data, 'NA')
(0, 1, 1, 2, 3, 5)

Inspection

fntools.isiterable(coll)

Return True if the collection is any iterable except a string

Parameters:	coll – a collection
Returns:	a boolean

>>> isiterable(1)
False
>>> isiterable('iterable')
False
>>> isiterable([1, 2, 3])
True

fntools.are_in(items, collection)

Return, for each item, True if it is in the collection and False otherwise

Parameters:	items – a sub-collection
		collection – a collection
Returns:	a list of booleans

>>> are_in(['Terry', 'James'], ['Terry', 'John', 'Eric'])
[True, False]

fntools.any_in(items, collection)

Return True if any of the items are in the collection

Parameters:	items – items that may be in the collection
		collection – a collection
Returns:	a boolean

>>> any_in(2, [1, 3, 2])
True
>>> any_in([1, 2], [1, 3, 2])
True
>>> any_in([1, 2], [1, 3])
True

fntools.all_in(items, collection)

Return True if all of the items are in the collection

Parameters:	items – items that may be in the collection
		collection – a collection
Returns:	a boolean

>>> all_in(2, [1, 3, 2])
True
>>> all_in([1, 2], [1, 3, 2])
True
>>> all_in([1, 2], [1, 3])
False

fntools.monotony(seq)

Determine the monotony of a sequence

Parameters:	seq – a sequence
Returns:	1 if the sequence is sorted (increasing)
		0 if it is not sorted
		-1 if it is sorted in reverse order (decreasing)

>>> monotony([1, 2, 3])
1
>>> monotony([1, 3, 2])
0
>>> monotony([3, 2, 1])
-1
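
The three return values can be derived by comparing neighbouring items. The Python 3 sketch below uses the hypothetical name monotony_sketch. How the library treats equal neighbours or constant sequences is not stated here, so the choice made in this sketch (non-strict comparisons, increasing checked first) is an assumption.

```python
def monotony_sketch(seq):
    """Return 1 for increasing, -1 for decreasing, 0 otherwise."""
    items = list(seq)
    pairs = list(zip(items, items[1:]))  # consecutive neighbours
    if all(a <= b for a, b in pairs):
        return 1  # assumption: a constant sequence counts as increasing
    if all(a >= b for a, b in pairs):
        return -1
    return 0


print(monotony_sketch([1, 2, 3]))  # 1
print(monotony_sketch([1, 3, 2]))  # 0
print(monotony_sketch([3, 2, 1]))  # -1
```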

fntools.occurrences(coll, value=None, **options)

Return the occurrences of the elements in the collection

Parameters:	coll – a collection
		value – a value in the collection
		options – an optional keyword used as a criterion to filter the values in the collection
Returns:	the frequency of the values in the collection as a dictionary, or the frequency of value alone when value is given

>>> occurrences((1, 1, 2, 3))
{1: 2, 2: 1, 3: 1}
>>> occurrences((1, 1, 2, 3), 1)
2

Keep only the values whose number of occurrences is <, <=, >, >=, == or != a given number:

>>> occurrences((1, 1, 2, 3), lt=3)
{1: 2, 2: 1, 3: 1}
>>> occurrences((1, 1, 2, 3), gt=1)
{1: 2}
>>> occurrences((1, 1, 2, 3), ne=1)
{1: 2}
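
The filters compare the count of each value, not the value itself, which is why lt=3 keeps every entry above. A Python 3 sketch of that logic with collections.Counter follows. The name occurrences_sketch is hypothetical, and resolving keyword names through the operator module (lt, gt, ne and so on) is an assumption about how the options could be interpreted.

```python
import operator
from collections import Counter


def occurrences_sketch(coll, value=None, **options):
    """Count the items of coll, optionally filtering on the counts."""
    counts = Counter(coll)
    if value is not None:  # limitation of this sketch: cannot count None
        return counts[value]
    result = dict(counts)
    for name, threshold in options.items():
        compare = getattr(operator, name)  # e.g. 'gt' -> operator.gt
        result = {k: n for k, n in result.items() if compare(n, threshold)}
    return result


data = (1, 1, 2, 3)
print(occurrences_sketch(data))        # {1: 2, 2: 1, 3: 1}
print(occurrences_sketch(data, 1))     # 2
print(occurrences_sketch(data, gt=1))  # {1: 2}
```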

fntools.attributes(data)

Return all the non callable and non special attributes of the input data

Parameters:	data – an object
Returns:	a list

>>> class table:
...     def __init__(self, name, rows, cols):
...         self.name = name
...         self.rows = rows
...         self.cols = cols

>>> t = table('people', 100, 3)
>>> attributes(t)
['cols', 'name', 'rows']

fntools.indexof(coll, item, start=0, default=None)

Return the index of the item in the collection

Parameters:	coll – an iterable
		item – a scalar
		start – (optional) the start index
		default – the value returned if the item is not in the collection
Returns:	idx – the index of the item in the collection

>>> monties = ['Eric', 'John', 'Terry', 'Terry', 'Graham', 'Mickael']
>>> indexof(monties, 'Terry')
2

>>> indexof(monties, 'Terry', start=3)
3

>>> indexof(monties, 'Terry', start=4) is None
True
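
Unlike list.index, which raises ValueError for a missing item, the examples show a default being returned. A Python 3 sketch of that behaviour follows; the name indexof_sketch is hypothetical, and it assumes the returned index is always relative to the beginning of the collection.

```python
def indexof_sketch(coll, item, start=0, default=None):
    """Return the first index >= start where item occurs, else default."""
    for idx, current in enumerate(coll):
        if idx >= start and current == item:
            return idx
    return default


monties = ['Eric', 'John', 'Terry', 'Terry', 'Graham', 'Mickael']
print(indexof_sketch(monties, 'Terry'))           # 2
print(indexof_sketch(monties, 'Terry', start=3))  # 3
print(indexof_sketch(monties, 'Terry', start=4))  # None
```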

fntools.indexesof(coll, item)

Return all the indexes of the item in the collection

Parameters:	coll – the collection
		item – a value
Returns:	a list of indexes

>>> monties = ['Eric', 'John', 'Terry', 'Terry', 'Graham', 'Mickael']
>>> indexesof(monties, 'Terry')
[2, 3]

fntools.count(fn, coll)

Return the count of True values returned by the predicate function applied to the collection

Parameters:	fn – a predicate function
		coll – a collection
Returns:	an integer

>>> count(lambda x: x % 2 == 0, [11, 22, 31, 24, 15])
2

fntools.isdistinct(coll)

Return True if all the items in the collection are distinct.

Parameters:	coll – a collection
Returns:	a boolean

>>> isdistinct([1, 2, 3])
True
>>> isdistinct([1, 2, 2])
False

fntools.nrow(records)

Return the number of rows in the records

Parameters:	records – a list of dictionaries
Returns:	an integer

>>> movies = [
... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6},
... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6},
... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6}
... ]
>>> nrow(movies)
3

fntools.ncol(records)

Return the number of columns in the records

Parameters:	records – a list of dictionaries
Returns:	an integer

>>> movies = [
... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6},
... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6},
... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6}
... ]
>>> ncol(movies)
4

fntools.names(records)

Return the column names of the records

Parameters:	records – a list of dictionaries
Returns:	a list of strings

>>> movies = [
... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6},
... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6},
... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6}
... ]
>>> names(movies)
['total_gross', 'year', 'budget', 'title']