Bags

Combi provides 4 bag classes for you to choose from according to your needs:

  • Bag - The simplest bag class
  • FrozenBag - An immutable (and thus hashable) bag class
  • OrderedBag - A bag class where items are ordered by insertion order
  • FrozenOrderedBag - An immutable, ordered bag class

Bag

class Bag(iterable={})

The Bag class is an implementation of the mathematical concept of a multiset; meaning something like a set, except that every item could appear multiple times, and crucially, we only save the count of the item in memory instead of saving multiple copies of the same item, which would be a waste of memory.

You may know the collections.Counter class from Python’s standard library; the bag classes provided by Combi are very similar, except that they are more strictly defined as multisets, meaning that counts must be positive integers. (Zeros are allowed in input but they are weeded out.) By contrast, collections.Counter allows any value as an item’s count.

This restriction makes the bag classes more powerful than collections.Counter because it allows more methods to be defined. More arithmetical operations are defined, comparison between bags is supported, and more.

When you create a bag, it will be populated with the iterable argument. If iterable is a plain sequence, its items will be added one-by-one:

>>> Bag('abracadabra')
Bag({'c': 1, 'b': 2, 'd': 1, 'a': 5, 'r': 2})

If iterable is a mapping, its values will be taken as item counts:

>>> Bag({'meow': 2, 'woof': 5,})
Bag({'meow': 2, 'woof': 5})

Bag can be accessed similarly to a dict or Counter:

>>> my_bag = Bag('abracadabra')
>>> my_bag['b']
2
>>> 'x' in my_bag
False
>>> my_bag['x'] = 7
>>> my_bag
Bag({'r': 2, 'x': 7, 'b': 2, 'a': 5, 'd': 1, 'c': 1})
>>> 'x' in my_bag
True
>>> my_bag['y'] # If it's not in the bag, the count is zero:
0
>>> for key, count in my_bag.items():
...     print((key, count))
...
('r', 2)
('x', 7)
('b', 2)
('a', 5)
('d', 1)
('c', 1)
elements

An iterator over the elements in the bag. If an item has a count of 7 in the bag, it will repeat 7 times in bag.elements. Example:

>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> for element in bag.elements:
...     print(element)
a
c
c
c
b
b

If you’re using OrderedBag or FrozenOrderedBag, the items will be yielded by their canonical order (usually insertion order), otherwise they’ll be yielded in arbitrary order, just like dicts and sets.

n_elements

The number of elements in the bag, i.e. the sum of all the counts. Example:

>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> bag.n_elements
6
clear()

Clear the bag, making it empty.

copy()

Create a shallow copy of the bag.

get(key, default=None)

Get a key’s count with a default fallback, just like dict.get().

keys()

Get an iterator over the keys in a bag:

>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> tuple(bag.keys())
('b', 'c', 'a')
values()

Get an iterator over the counts in a bag:

>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> tuple(bag.values())
(2, 3, 1)
items()

Get an iterator over the items in a bag, i.e. keys with counts:

>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> tuple(bag.items())
(('b', 2), ('c', 3), ('a', 1))
most_common([n])

Get a tuple of the n most common elements and their counts from the most common to the least. If n is not specified, most_common() returns all elements in the bag. Elements with equal counts are ordered arbitrarily:

>>> Bag('abracadabra').most_common(3)
(('a', 5), ('r', 2), ('b', 2))
pop(key[, default])

Get the count of key and remove it from the bag, optionally with a default fallback.

popitem()

Remove a key from the bag, and get a tuple (key, count).

setdefault(key, default=None)

Get the count of key, optionally with a default fallback. If key is missing, its count in the bag will be set to the default.

update(mapping)

Update the bag with a mapping of key to count.

get_contained_bags()

Get all bags that are subsets of this bag.

This means all bags that have counts identical or smaller for each key.

Example:

>>> bag = Bag({'a': 1, 'b': 2, 'c': 3,})
>>> contained_bags = bag.get_contained_bags()
>>> len(contained_bags)
24
>>> contained_bags[7]
Bag({'c': 2, 'b': 1})
>>> contained_bags[7] < bag # Contained bag is contained.
True

FrozenBag

class FrozenBag(iterable={})

FrozenBag is a multiset just like Bag, except it’s immutable. After it’s created, it can’t be modified. It’s a subclass over Bag.

In which cases would you want to use FrozenBag rather than Bag?

  • If you want to have your bag as a key in a set, dict, or any other kind of hashtable. (A normal bag can’t be used for this because it’s mutable and thus not hashable.)
  • If you want to make it clear in your program that a certain bag should never be changed after it’s created.
get_mutable()

Get a mutable version of this frozen bag. Example:

>>> frozen_bag = FrozenBag('abracadabra')
>>> frozen_bag
FrozenBag({'d': 1, 'c': 1, 'b': 2, 'a': 5, 'r': 2})
>>> frozen_bag.get_mutable()
Bag({'d': 1, 'c': 1, 'b': 2, 'a': 5, 'r': 2})

OrderedBag

class OrderedBag(iterable={})

OrderedBag is a multiset just like Bag, except it’s also ordered. Items have a defined order, which is by default the order in which they were inserted. OrderedBag is a subclass over Bag.

Another way to think of it is that OrderedBag is to Bag what collections.OrderedDict is to dict.

Example:

>>> ordered_bag = OrderedBag('abbcccdddd')
>>> tuple(ordered_bag.elements) # Items ordered by insertion order:
('a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'd')
>>> tuple(ordered_bag.items())
(('a', 1), ('b', 2), ('c', 3), ('d', 4))
>>> ordered_bag.index('b')
1
reversed

Get a version of this ordered bag with order reversed. Example:

>>> ordered_bag = OrderedBag('abbcccdddd')
>>> ordered_bag
OrderedBag(OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)]))
>>> ordered_bag.reversed
OrderedBag(OrderedDict([('d', 4), ('c', 3), ('b', 2), ('a', 1)]))

This does not change the existing bag, but creates a new one.

index(key)

Get the index number (i.e. position) of key in the ordered bag.

move_to_end(key, last=True)

Move a key to the end of the order. Specify last=False to move it to the start instead.

sort(key=None, reverse=False)

Sort the keys, changing the order in-place.

The optional key argument, (not to be confused with the bag keys,) is a key function. If it’s not passed in, default Python ordering will be used.

If reverse=True is used, the keys will be sorted in reverse order.

FrozenOrderedBag

class FrozenOrderedBag(iterable={})

FrozenOrderedBag is a multiset just like Bag, except it’s immutable and ordered. You can think of it as a combination of FrozenBag and OrderedBag.

Comparisons between bags

One of the advantages of Bag over collections.Counter is that Bag provides comparison methods between bags, that act similarly to comparisons between Python’s built-in sets. This makes it easy to see whether one bag is contained by another. Example:

>>> sandwich = Bag({'bread': 2, 'cheese': 1, 'tomato': 2, 'burger': 1,})
>>> vegan_sandwich = Bag({'bread': 2, 'tomato': 2,})
>>> vegan_sandwich < sandwich
True
>>> salad = Bag({'tomato': 2, 'bell pepper': 1,})
>>> salad < sandwich # False because there's no bell pepper in our sandwich
False

A bag is smaller-or-equal to another bag if it’s “contained” in it, meaning that every key in the contained bag also exists in the containing bag, and with a count that’s bigger or equal to its count in the contained bag.

A bag is strictly smaller than another bag if the above holds, and there’s at least one key for which the count in the containing bag is strictly bigger than its count in the contained bag.

Please note that unlike most comparisons in Python, this is a partial order rather than a total one, meaning that for some pairs of bags, neither x >= y nor y >= x holds true. This is similar to set comparison in Python.

Arithmetic operations on bags

Another advantage of Bag over collections.Counter is that Bag provides a wealth of arithmetic operations (addition, subtraction, etc.) between bags, and between bags and integers.

The basic arithmetic operations do what you expect them to, operating on the counts of items:

>>> sandwich = Bag({'bread': 2, 'cheese': 1, 'tomato': 2, 'burger': 1,})
>>> salad = Bag({'tomato': 2, 'bell pepper': 1,})
>>> single_tomato = Bag({'tomato': 1,})
>>> sandwich + single_tomato # Addition
Bag({'cheese': 1, 'bread': 2, 'tomato': 3, 'burger': 1})
>>> sandwich - single_tomato # Subtraction
Bag({'cheese': 1, 'bread': 2, 'tomato': 1, 'burger': 1})
>>> sandwich * 3 # Multiplication
Bag({'cheese': 3, 'bread': 6, 'tomato': 6, 'burger': 3})

As for division: You can divide one bag by another to get an integer saying how many times the second bag would go into the first. You can also divide a bag by an integer, which will divide all the counts by that integer, rounding down. All bag division is done with the floor-division operator //, because it’s always rounded-down. Examples:

>>> sandwich // single_tomato # Floor-division by another bag
2
>>> sandwich // 2 # Floor-division by integer
Bag({'bread': 1, 'tomato': 1})

The more advanced operations are also supported:

>>> sandwich % salad # Modulo by bag
Bag({'bread': 2, 'cheese': 1, 'burger': 1, 'tomato': 2})
>>> divmod(sandwich, salad) # Divmod by bag
(0, Bag({'tomato': 2, 'cheese': 1, 'bread': 2, 'burger': 1}))
>>> salad % 2 # Modulo by integer
Bag({'bell pepper': 1})
>>> divmod(salad, 2) # Divmod by integer
(Bag({'tomato': 1}), Bag({'bell pepper': 1}))
>>> salad ** 3 # Exponentiation
Bag({'tomato': 8, 'bell pepper': 1})
>>> pow(salad, 3, 5) # Exponentiation with modulo
Bag({'tomato': 3, 'bell pepper': 1})

And... Did someone say logical operations? Like in Python sets, these do union and intersection:

>>> sandwich | salad # Logical or, a.k.a. set union
Bag({'bread': 2, 'cheese': 1, 'burger': 1, 'tomato': 2, 'bell pepper': 1})
>>> sandwich & salad # Logical and, a.k.a. set intersection
Bag({'tomato': 2})

As a final note about arithmetic operations, augmented assignment is supported for all operations, so you can elegantly mutate bags in-place like this:

>>> sandwich += Bag({'bacon': 2, 'egg': 1,})
>>> sandwich
Bag({'cheese': 1, 'bacon': 2, 'bread': 2, 'egg': 1, 'burger': 1, 'tomato': 2})
>>> sandwich **= 2
>>> sandwich
Bag({'cheese': 1, 'bacon': 4, 'bread': 4, 'egg': 1, 'burger': 1, 'tomato': 4})

Table Of Contents

Previous topic

CombSpace and Comb

Next topic

Other classes

This Page