“This sort of thing is my bag, baby!”
—Austin Powers
Combi provides 4 bag classes for you to choose from according to your needs:
Bag
- The simplest bag classFrozenBag
- An immutable (and thus hashable) bag classOrderedBag
- A bag class where items are ordered by insertion orderFrozenOrderedBag
- An immutable, ordered bag classBag
¶Bag
(iterable={})¶The Bag
class is an implementation of the mathematical concept of a
multiset; meaning something like a set, except that every item could
appear multiple times, and crucially, we only save the count of the item
in memory instead of saving multiple copies of the same item, which would be
a waste of memory.
You may know the collections.Counter
class from Python’s standard
library; the bag classes provided by Combi are very similar, except that
they are more strictly defined as multisets, meaning that counts must be
positive integers. (Zeros are allowed in input but they are weeded out.) By
contrast, collections.Counter
allows any value as an item’s count.
This restriction makes the bag classes more powerful than
collections.Counter
because it allows more methods to be defined.
More arithmetical operations are defined, comparison between bags is
supported, and more.
When you create a bag, it will be populated with the iterable
argument.
If iterable
is a plain sequence, its items will be added one-by-one:
>>> Bag('abracadabra')
Bag({'c': 1, 'b': 2, 'd': 1, 'a': 5, 'r': 2})
If iterable
is a mapping, its values will be taken as item counts:
>>> Bag({'meow': 2, 'woof': 5,})
Bag({'meow': 2, 'woof': 5})
Bag
can be accessed similarly to a dict
or Counter
:
>>> my_bag = Bag('abracadabra')
>>> my_bag['b']
2
>>> 'x' in my_bag
False
>>> my_bag['x'] = 7
>>> my_bag
Bag({'r': 2, 'x': 7, 'b': 2, 'a': 5, 'd': 1, 'c': 1})
>>> 'x' in my_bag
True
>>> my_bag['y'] # If it's not in the bag, the count is zero:
0
>>> for key, count in my_bag.items():
... print((key, count))
...
('r', 2)
('x', 7)
('b', 2)
('a', 5)
('d', 1)
('c', 1)
elements
¶An iterator over the elements in the bag. If an item has a count of 7 in
the bag, it will repeat 7 times in bag.elements
. Example:
>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> for element in bag.elements:
... print(element)
a
c
c
c
b
b
If you’re using OrderedBag
or
FrozenOrderedBag
, the items will be
yielded by their canonical order (usually insertion order), otherwise
they’ll be yielded in arbitrary order, just like dicts and sets.
n_elements
¶The number of elements in the bag, i.e. the sum of all the counts. Example:
>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> bag.n_elements
6
clear
()¶Clear the bag, making it empty.
copy
()¶Create a shallow copy of the bag.
get
(key, default=None)¶Get a key’s count with a default fallback, just like dict.get()
.
keys
()¶Get an iterator over the keys in a bag:
>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> tuple(bag.keys())
('b', 'c', 'a')
values
()¶Get an iterator over the counts in a bag:
>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> tuple(bag.values())
(2, 3, 1)
items
()¶Get an iterator over the items in a bag, i.e. keys with counts:
>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> tuple(bag.items())
(('b', 2), ('c', 3), ('a', 1))
most_common
([n])¶Get a tuple of the n
most common elements and their counts from the
most common to the least. If n is not specified, most_common()
returns all elements in the bag. Elements with equal counts are ordered
arbitrarily:
>>> Bag('abracadabra').most_common(3)
(('a', 5), ('r', 2), ('b', 2))
pop
(key[, default])¶Get the count of key
and remove it from the bag, optionally with a
default fallback.
popitem
()¶Remove a key from the bag, and get a tuple (key, count)
.
setdefault
(key, default=None)¶Get the count of key, optionally with a default fallback. If key is missing, its count in the bag will be set to the default.
update
(mapping)¶Update the bag with a mapping of key to count.
get_contained_bags
()¶Get all bags that are subsets of this bag.
This means all bags that have counts identical or smaller for each key.
Example:
>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> contained_bags = bag.get_contained_bags()
>>> len(contained_bags)
24
>>> contained_bags[7]
Bag({'c': 2, 'b': 1})
>>> contained_bags[7] < bag # Contained bag is contained.
True
FrozenBag
¶FrozenBag
(iterable={})¶FrozenBag
is a multiset just like Bag
, except it’s
immutable. After it’s created, it can’t be modified. It’s a subclass over
Bag
.
In which cases would you want to use FrozenBag
rather than
Bag
?
set
, dict
,
or any other kind of hashtable. (A normal bag can’t be used for this
because it’s mutable and thus not hashable.)get_mutable
()¶Get a mutable version of this frozen bag. Example:
>>> frozen_bag = FrozenBag('abracadabra')
>>> frozen_bag
FrozenBag({'d': 1, 'c': 1, 'b': 2, 'a': 5, 'r': 2})
>>> frozen_bag.get_mutable()
Bag({'d': 1, 'c': 1, 'b': 2, 'a': 5, 'r': 2})
OrderedBag
¶OrderedBag
(iterable={})¶OrderedBag
is a multiset just like Bag
, except it’s also
ordered. Items have a defined order, which is by default the order in which
they were inserted. OrderedBag
is a subclass over Bag
.
Another way to think of it is that OrderedBag
is to Bag
what collections.OrderedDict
is to dict
.
Example:
>>> ordered_bag = OrderedBag('abbcccdddd')
>>> tuple(ordered_bag.elements) # Items ordered by insertion order:
('a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'd')
>>> tuple(ordered_bag.items())
(('a', 1), ('b', 2), ('c', 3), ('d', 4))
>>> ordered_bag.index('b')
1
reversed
¶Get a version of this ordered bag with order reversed. Example:
>>> ordered_bag = OrderedBag('abbcccdddd')
>>> ordered_bag
OrderedBag(OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)]))
>>> ordered_bag.reversed
OrderedBag(OrderedDict([('d', 4), ('c', 3), ('b', 2), ('a', 1)]))
This does not change the existing bag, but creates a new one.
index
(key)¶Get the index number (i.e. position) of key
in the ordered bag.
move_to_end
(key, last=True)¶Move a key to the end of the order. Specify last=False to move it to the start instead.
sort
(key=None, reverse=False)¶Sort the keys, changing the order in-place.
The optional key
argument, (not to be confused with the bag keys,) is
a key function. If it’s not passed in, default Python ordering will be
used.
If reverse=True
is used, the keys will be sorted in reverse order.
FrozenOrderedBag
¶FrozenOrderedBag
(iterable={})¶FrozenOrderedBag
is a multiset just like Bag
, except it’s
immutable and ordered. You can think of it as a combination of
FrozenBag
and OrderedBag
.
One of the advantages of Bag
over collections.Counter
is that
Bag
provides comparison methods between bags, that act similarly to
comparisons between Python’s built-in sets. This makes it easy to see whether
one bag is contained by another. Example:
>>> sandwich = Bag({'bread': 2, 'cheese': 1, 'tomato': 2, 'burger': 1,})
>>> vegan_sandwich = Bag({'bread': 2, 'tomato': 2,})
>>> vegan_sandwich < sandwich
True
>>> salad = Bag({'tomato': 2, 'bell pepper': 1,})
>>> salad < sandwich # False because there's no bell pepper in our sandwich
False
A bag is smaller-or-equal to another bag if it’s “contained” in it, meaning that every key in the contained bag also exists in the containing bag, and with a count that’s bigger or equal to its count in the contained bag.
A bag is strictly smaller than another bag if the above holds, and there’s at least one key for which the count in the containing bag is strictly bigger than its count in the contained bag.
Please note that unlike most comparisons in Python, this is a partial order
rather than a total one, meaning that for some pairs of bags, neither x >=
y
nor y >= x
holds true. This is similar to set comparison in Python.
Another advantage of Bag
over collections.Counter
is that
Bag
provides a wealth of arithmetic operations (addition,
subtraction, etc.) between bags, and between bags and integers.
The basic arithmetic operations do what you expect them to, operating on the counts of items:
>>> sandwich = Bag({'bread': 2, 'cheese': 1, 'tomato': 2, 'burger': 1,})
>>> salad = Bag({'tomato': 2, 'bell pepper': 1,})
>>> single_tomato = Bag({'tomato': 1,})
>>> sandwich + single_tomato # Addition
Bag({'cheese': 1, 'bread': 2, 'tomato': 3, 'burger': 1})
>>> sandwich - single_tomato # Subtraction
Bag({'cheese': 1, 'bread': 2, 'tomato': 1, 'burger': 1})
>>> sandwich * 3 # Multiplication
Bag({'cheese': 3, 'bread': 6, 'tomato': 6, 'burger': 3})
As for division: You can divide one bag by another to get an integer saying how
many times the second bag would go into the first. You can also divide a bag by
an integer, which will divide all the counts by that integer, rounding down.
All bag division is done with the floor-division operator //
, because it’s
always rounded-down. Examples:
>>> sandwich // single_tomato # Floor-division by another bag
2
>>> sandwich // 2 # Floor-division by integer
Bag({'bread': 1, 'tomato': 1})
The more advanced operations are also supported:
>>> sandwich % salad # Modulo by bag
Bag({'bread': 2, 'cheese': 1, 'burger': 1, 'tomato': 2})
>>> divmod(sandwich, salad) # Divmod by bag
(0, Bag({'tomato': 2, 'cheese': 1, 'bread': 2, 'burger': 1}))
>>> salad % 2 # Modulo by integer
Bag({'bell pepper': 1})
>>> divmod(salad, 2) # Divmod by integer
(Bag({'tomato': 1}), Bag({'bell pepper': 1}))
>>> salad ** 3 # Exponentiation
Bag({'tomato': 8, 'bell pepper': 1})
>>> pow(salad, 3, 5) # Exponentiation with modulo
Bag({'tomato': 3, 'bell pepper': 1})
And... Did someone say logical operations? Like in Python sets, these do union and intersection:
>>> sandwich | salad # Logical or, a.k.a. set union
Bag({'bread': 2, 'cheese': 1, 'burger': 1, 'tomato': 2, 'bell pepper': 1})
>>> sandwich & salad # Logical and, a.k.a. set intersection
Bag({'tomato': 2})
As a final note about arithmetic operations, augmented assignment is supported for all operations, so you can elegantly mutate bags in-place like this:
>>> sandwich += Bag({'bacon': 2, 'egg': 1,})
>>> sandwich
Bag({'cheese': 1, 'bacon': 2, 'bread': 2, 'egg': 1, 'burger': 1, 'tomato': 2})
>>> sandwich **= 2
>>> sandwich
Bag({'cheese': 1, 'bacon': 4, 'bread': 4, 'egg': 1, 'burger': 1, 'tomato': 4})