API¶
Transformation¶
-
fntools.
use_with
(data, fn, *attrs)[source]¶ Apply a function on the attributes of the data
Parameters: - data – an object
- fn – a function
- attrs – some attributes of the object
Returns: an object
Let’s create some data first:
>>> from collections import namedtuple >>> Person = namedtuple('Person', ('name', 'age', 'gender')) >>> alice = Person('Alice', 30, 'F')
Usage:
>>> make_csv_row = lambda n, a, g: '%s,%d,%s' % (n, a, g) >>> use_with(alice, make_csv_row, 'name', 'age', 'gender') 'Alice,30,F'
-
fntools.
zip_with
(fn, *colls)[source]¶ Return the result of the function applied on the zip of the collections
Parameters: - fn – a function
- colls – collections
Returns: an iterator
>>> list(zip_with(lambda x, y: x-y, [10, 20, 30], [42, 19, 43])) [-32, 1, -13]
-
fntools.
unzip
(colls)[source]¶ Unzip collections
Parameters: colls – collections Returns: unzipped collections >>> unzip([[1, 2, 3], [10, 20, 30], [100, 200, 300]]) [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
-
fntools.
concat
(colls)[source]¶ Concatenate a list of collections
Parameters: colls – a list of collections Returns: the concatenation of the collections >>> concat(([1, 2], [3, 4])) [1, 2, 3, 4]
-
fntools.
mapcat
(fn, colls)[source]¶ Concatenate the result of a map
Parameters: - fn – a function
- colls – a list of collections
Returns: a list
>>> mapcat(reversed, [[3, 2, 1, 0], [6, 5, 4], [9, 8, 7]]) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
-
fntools.
dmap
(fn, record)[source]¶ map for a directory
Parameters: - fn – a function
- record – a dictionary
Returns: a dictionary
>>> grades = [{'math': 13, 'biology': 17, 'chemistry': 18}, ... {'math': 15, 'biology': 12, 'chemistry': 13}, ... {'math': 16, 'biology': 17, 'chemistry': 11}]
>>> def is_greater_than(x): ... def func(y): ... return y > x ... return func
>>> dmap(is_greater_than(15), grades[0]) {'biology': True, 'chemistry': True, 'math': False}
-
fntools.
rmap
(fn, coll, is_iterable=None)[source]¶ A recursive map
Parameters: - fn – a function
- coll – a list
- isiterable – a predicate function determining whether a value is iterable.
Returns: a list
>>> rmap(lambda x: 2*x, [1, 2, [3, 4]]) [2, 4, [6, 8]]
-
fntools.
replace
(x, old, new, fn=<built-in function eq>)[source]¶ Replace x with new if fn(x, old) is True.
Parameters: - x – Any value
- old – The old value we want to replace
- new – The value replacing old
- fn – The predicate function determining the relation between x and old. By default fn is the equality function.
Returns: x or new
>>> map(lambda x: replace(x, None, -1), [None, 1, 2, None]) [-1, 1, 2, -1]
-
fntools.
compose
(*fns)[source]¶ Return the function composed with the given functions
Parameters: fns – functions Returns: a function >>> add2 = lambda x: x+2 >>> mult3 = lambda x: x*3 >>> new_fn = compose(add2, mult3) >>> new_fn(2) 8
Note
compose(fn1, fn2, fn3) is the same as fn1(fn2(fn3)) which means that the last function provided is the first to be applied.
-
fntools.
groupby
(fn, coll)[source]¶ Group elements in sub-collections by fn
Parameters: - fn – a function
- coll – a collection
Returns: a dictionary
>>> groupby(len, ['John', 'Terry', 'Eric', 'Graham', 'Mickael']) {4: ['John', 'Eric'], 5: ['Terry'], 6: ['Graham'], 7: ['Mickael']}
-
fntools.
reductions
(fn, seq, acc=None)[source]¶ Return the intermediate values of a reduction
Parameters: - fn – a function
- seq – a sequence
- acc – the accumulator
Returns: a list
>>> reductions(lambda x, y: x + y, [1, 2, 3]) [1, 3, 6]
>>> reductions(lambda x, y: x + y, [1, 2, 3], 10) [11, 13, 16]
-
fntools.
split
(coll, factor)[source]¶ Split a collection by using a factor
Parameters: - coll – a collection
- factor – a collection of factors
Returns: a dictionary
>>> bands = ('Led Zeppelin', 'Debussy', 'Metallica', 'Iron Maiden', 'Bach') >>> styles = ('rock', 'classic', 'rock', 'rock', 'classic') >>> split(bands, styles) {'classic': ['Debussy', 'Bach'], 'rock': ['Led Zeppelin', 'Metallica', 'Iron Maiden']}
-
fntools.
assoc
(_d, key, value)[source]¶ Associate a key with a value in a dictionary
Parameters: - _d – a dictionary
- key – a key in the dictionary
- value – a value for the key
Returns: a new dictionary
>>> data = {} >>> new_data = assoc(data, 'name', 'Holy Grail') >>> new_data {'name': 'Holy Grail'} >>> data {}
Note
the original dictionary is not modified
-
fntools.
dispatch
(data, fns)[source]¶ Apply the functions on the data
Parameters: - data – the data
- fns – a list of functions
Returns: a collection
>>> x = (1, 42, 5, 79) >>> dispatch(x, (min, max)) [1, 79]
-
fntools.
multimap
(fn, colls)[source]¶ Apply a function on multiple collections
Parameters: - fn – a function
- colls – collections
Returns: a collection
>>> multimap(operator.add, ((1, 2, 3), (4, 5, 6))) [5, 7, 9]
>>> f = lambda x, y, z: 2*x + 3*y - z >>> result = multimap(f, ((1, 2), (4, 1), (1, 1))) >>> result[0] == f(1, 4, 1) True >>> result[1] == f(2, 1, 1) True
-
fntools.
multistarmap
(fn, *colls)[source]¶ Apply a function on multiple collections
Parameters: - fn – a function
- colls – collections
Returns: a collection
>>> multistarmap(operator.add, (1, 2, 3), (4, 5, 6)) [5, 7, 9]
>>> f = lambda x, y, z: 2*x + 3*y - z >>> result = multistarmap(f, (1, 2), (4, 1), (1, 1)) >>> result[0] == f(1, 4, 1) True >>> result[1] == f(2, 1, 1) True
-
fntools.
pipe
(data, *fns)[source]¶ Apply functions recursively on your data
Parameters: - data – the data
- fns – functions
Returns: an object
>>> inc = lambda x: x + 1 >>> pipe(42, inc, str) '43'
-
fntools.
pipe_each
(coll, *fns)[source]¶ Apply functions recursively on your collection of data
Parameters: - coll – a collection
- fns – functions
Returns: a list
>>> inc = lambda x: x + 1 >>> pipe_each([0, 1, 1, 2, 3, 5], inc, str) ['1', '2', '2', '3', '4', '6']
-
fntools.
shift
(func, *args, **kwargs)[source]¶ This function is basically a beefed up lambda x: func(x, *args, **kwargs)
shift()
comes in handy when it is used in a pipeline with a function that needs the passed value as its first argument.Parameters: - func – a function
- args – objects
- kwargs – keywords
>>> def div(x, y): return float(x) / y
This is equivalent to div(42, 2):
>>> shift(div, 2)(42) 21.0
which is different from div(2, 42):
>>> from functools import partial >>> partial(div, 2)(42) 0.047619047619047616
-
fntools.
repeatedly
(func)[source]¶ Repeat a function taking no argument
Parameters: func – a function Returns: a generator >>> import random as rd >>> rd.seed(123) >>> take(3, repeatedly(rd.random)) [0.052363598850944326, 0.08718667752263232, 0.4072417636703983]
-
fntools.
update
(records, column, values)[source]¶ Update the column of records
Parameters: - records – a list of dictionaries
- column – a string
- values – an iterable or a function
Returns: new records with the columns updated
>>> movies = [ ... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6}, ... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6}, ... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6} ... ] >>> new_movies = update(movies, 'budget', lambda x: 2*x) >>> [new_movies[i]['budget'] for i,_ in enumerate(movies)] [800000.0, 8000000.0, 18000000.0]
>>> new_movies2 = update(movies, 'budget', (40, 400, 900)) >>> [new_movies2[i]['budget'] for i,_ in enumerate(movies)] [40, 400, 900]
-
fntools.
use
(data, attrs)[source]¶ Return the values of the attributes for the given data
Parameters: - data – the data
- attrs – strings
Returns: a list
With a dict:
>>> band = {'name': 'Metallica', 'singer': 'James Hetfield', 'guitarist': 'Kirk Hammet'} >>> use(band, ('name', 'date', 'singer')) ['Metallica', None, 'James Hetfield']
With a non dict data structure:
>>> from collections import namedtuple >>> Person = namedtuple('Person', ('name', 'age', 'gender')) >>> alice = Person('Alice', 30, 'F') >>> use(alice, ('name', 'gender')) ['Alice', 'F']
-
fntools.
get_in
(record, *keys, **kwargs)[source]¶ Return the value corresponding to the keys in a nested record
Parameters: - record – a dictionary
- keys – strings
- kwargs – keywords
Returns: the value for the keys
>>> d = {'id': {'name': 'Lancelot', 'actor': 'John Cleese', 'color': 'blue'}} >>> get_in(d, 'id', 'name') 'Lancelot'
>>> get_in(d, 'id', 'age', default='?') '?'
-
fntools.
valueof
(records, key)[source]¶ Extract the value corresponding to the given key in all the dictionaries
>>> bands = [{'name': 'Led Zeppelin', 'singer': 'Robert Plant', 'guitarist': 'Jimmy Page'}, ... {'name': 'Metallica', 'singer': 'James Hetfield', 'guitarist': 'Kirk Hammet'}] >>> valueof(bands, 'singer') ['Robert Plant', 'James Hetfield']
Filtering¶
-
fntools.
duplicates
(coll)[source]¶ Return the duplicated items in the given collection
Parameters: coll – a collection Returns: a list of the duplicated items in the collection >>> duplicates([1, 1, 2, 3, 3, 4, 1, 1]) [1, 3]
-
fntools.
pluck
(record, *keys, **kwargs)[source]¶ Return the record with the selected keys
Parameters: - record – a list of dictionaries
- keys – some keys from the record
- kwargs – keywords determining how to deal with the keys
>>> d = {'name': 'Lancelot', 'actor': 'John Cleese', 'color': 'blue'} >>> pluck(d, 'name', 'color') {'color': 'blue', 'name': 'Lancelot'}
The keyword ‘default’ allows to replace a
None
value:>>> d = {'year': 2014, 'movie': 'Bilbo'} >>> pluck(d, 'year', 'movie', 'nb_aliens', default=0) {'movie': 'Bilbo', 'nb_aliens': 0, 'year': 2014}
-
fntools.
pluck_each
(records, columns)[source]¶ Return the records with the selected columns
Parameters: - records – a list of dictionaries
- columns – a list or a tuple
Returns: a list of dictionaries with the selected columns
>>> movies = [ ... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6}, ... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6}, ... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6} ... ] >>> pluck_each(movies, ('title', 'year')) [{'year': 1975, 'title': 'The Holy Grail'}, {'year': 1979, 'title': 'Life of Brian'}, {'year': 1983, 'title': 'The Meaning of Life'}]
-
fntools.
take
(n, seq)[source]¶ Return the n first items in the sequence
Parameters: - n – an integer
- seq – a sequence
Returns: a list
>>> take(3, xrange(10000)) [0, 1, 2]
-
fntools.
drop
(n, seq)[source]¶ Return the n last items in the sequence
Parameters: - n – an integer
- seq – a sequence
Returns: a list
>>> drop(9997, xrange(10000)) [9997, 9998, 9999]
-
fntools.
find
(fn, record)[source]¶ Apply a function on the record and return the corresponding new record
Parameters: - fn – a function
- record – a dictionary
Returns: a dictionary
>>> find(max, {'Terry': 30, 'Graham': 35, 'John': 27}) {'Graham': 35}
-
fntools.
find_each
(fn, records)[source]¶ Apply a function on the records and return the corresponding new record
Parameters: - fn – a function
- records – a collection of dictionaries
Returns: new records
>>> grades = [{'math': 13, 'biology': 17, 'chemistry': 18}, ... {'math': 15, 'biology': 12, 'chemistry': 13}, ... {'math': 16, 'biology': 17, 'chemistry': 11}] >>> find_each(max, grades) [{'chemistry': 18}, {'math': 15}, {'biology': 17}]
Inspection¶
-
fntools.
isiterable
(coll)[source]¶ Return True if the collection is any iterable except a string
Parameters: coll – a collection Returns: a boolean >>> isiterable(1) False >>> isiterable('iterable') False >>> isiterable([1, 2, 3]) True
-
fntools.
are_in
(items, collection)[source]¶ Return True for each item in the collection
Parameters: - items – a sub-collection
- collection – a collection
Returns: a list of booleans
>>> are_in(['Terry', 'James'], ['Terry', 'John', 'Eric']) [True, False]
-
fntools.
any_in
(items, collection)[source]¶ Return True if any of the items are in the collection
Parameters: - items – items that may be in the collection
- collection – a collection
Returns: a boolean
>>> any_in(2, [1, 3, 2]) True >>> any_in([1, 2], [1, 3, 2]) True >>> any_in([1, 2], [1, 3]) True
-
fntools.
all_in
(items, collection)[source]¶ Return True if all of the items are in the collection
Parameters: - items – items that may be in the collection
- collection – a collection
Returns: a boolean
>>> all_in(2, [1, 3, 2]) True >>> all_in([1, 2], [1, 3, 2]) True >>> all_in([1, 2], [1, 3]) False
-
fntools.
monotony
(seq)[source]¶ Determine the monotony of a sequence
Parameters: seq – a sequence Returns: 1 if the sequence is sorted (increasing) Returns: 0 if it is not sorted Returns: -1 if it is sorted in reverse order (decreasing) >>> monotony([1, 2, 3]) 1 >>> monotony([1, 3, 2]) 0 >>> monotony([3, 2, 1]) -1
-
fntools.
occurrences
(coll, value=None, **options)[source]¶ Return the occurrences of the elements in the collection
Parameters: - coll – a collection
- value – a value in the collection
- options – an optional keyword used as a criterion to filter the values in the collection
Returns: the frequency of the values in the collection as a dictionary
>>> occurrences((1, 1, 2, 3)) {1: 2, 2: 1, 3: 1} >>> occurrences((1, 1, 2, 3), 1) 2
Filter the values of the occurrences that are <, <=, >, >=, == or != than a given number:
>>> occurrences((1, 1, 2, 3), lt=3) {1: 2, 2: 1, 3: 1} >>> occurrences((1, 1, 2, 3), gt=1) {1: 2} >>> occurrences((1, 1, 2, 3), ne=1) {1: 2}
-
fntools.
attributes
(data)[source]¶ Return all the non callable and non special attributes of the input data
Parameters: data – an object Returns: a list >>> class table: ... def __init__(self, name, rows, cols): ... self.name = name ... self.rows = rows ... self.cols = cols
>>> t = table('people', 100, 3) >>> attributes(t) ['cols', 'name', 'rows']
-
fntools.
indexof
(coll, item, start=0, default=None)[source]¶ Return the index of the item in the collection
Parameters: - coll – iterable
- item – scalar
- start – (optional) The start index
Default: The default value of the index if the item is not in the collection
Returns: idx – The index of the item in the collection
>>> monties = ['Eric', 'John', 'Terry', 'Terry', 'Graham', 'Mickael'] >>> indexof(monties, 'Terry') 2
>>> indexof(monties, 'Terry', start=3) 3
>>> indexof(monties, 'Terry', start=4) is None True
-
fntools.
indexesof
(coll, item)[source]¶ Return all the indexes of the item in the collection
Parameters: - coll – the collection
- item – a value
Returns: a list of indexes
>>> monties = ['Eric', 'John', 'Terry', 'Terry', 'Graham', 'Mickael'] >>> indexesof(monties, 'Terry') [2, 3]
-
fntools.
count
(fn, coll)[source]¶ Return the count of True values returned by the predicate function applied to the collection
Parameters: - fn – a predicate function
- coll – a collection
Returns: an integer
>>> count(lambda x: x % 2 == 0, [11, 22, 31, 24, 15]) 2
-
fntools.
isdistinct
(coll)[source]¶ Return True if all the items in the collections are distinct.
Parameters: coll – a collection Returns: a boolean >>> isdistinct([1, 2, 3]) True >>> isdistinct([1, 2, 2]) False
-
fntools.
nrow
(records)[source]¶ Return the number of rows in the records
Parameters: records – a list of dictionaries Returns: an integer >>> movies = [ ... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6}, ... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6}, ... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6} ... ] >>> nrow(movies) 3
-
fntools.
ncol
(records)[source]¶ Return the number of columns in the records
Parameters: records – a list of dictionaries Returns: an integer >>> movies = [ ... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6}, ... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6}, ... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6} ... ] >>> ncol(movies) 4
-
fntools.
names
(records)[source]¶ Return the column names of the records
Parameters: records – a list of dictionaries Returns: a list of strings >>> movies = [ ... {'title': 'The Holy Grail', 'year': 1975, 'budget': 4E5, 'total_gross': 5E6}, ... {'title': 'Life of Brian', 'year': 1979, 'budget': 4E6, 'total_gross': 20E6}, ... {'title': 'The Meaning of Life', 'year': 1983, 'budget': 9E6, 'total_gross': 14.9E6} ... ] >>> names(movies) ['total_gross', 'year', 'budget', 'title']