If you are eager to learn some Python and do not know how to start, this post may give you some hints. I will develop a very simple Python package from scratch, exemplifying some Object-oriented Programming (OOP) techniques and concepts, and using a Test-Driven Development (TDD) approach.

The package will provide some classes to deal with binary numbers (see the Rationale section), but remember that it is just a toy project. Nothing in this package has been designed with performance in mind: it wants to be as clear as possible.

## Rationale

Binary numbers are rather easy to understand, even if becoming familiar with them requires some time. I expect you to have some knowledge of the binary numeral system. If you need to review it, just take a look at the Wikipedia entry or one of the countless resources on the Internet.

The package we are going to write will provide a class that represents binary numbers (`Binary`) and a class that represents binary numbers with a given bit size (`SizeBinary`). They shall provide basic binary operations like logical (and, or, xor), arithmetic (addition, subtraction, multiplication, division), shifts and indexing.

A quick example of what the package shall do:

```python
>>> b = Binary('0101110001')
>>> hex(b)
'0x171'
>>> int(b)
369
>>> b[0]
'1'
>>> b[1]
'0'
>>> b.SHR()
'10111000'
```

## Python and bases

The binary system is just a representation of numbers in base 2, just like hexadecimal (base 16) and decimal (base 10). Python can already deal natively with different bases, even if internally numbers are always stored as binary integers (more precisely, as numbers in base `2**30`, where each digit is itself a binary number). Let us check it

```python
>>> a = 5
>>> a
5
>>> a = 0x5
>>> a
5
>>> a = 0b101
>>> a
5
>>> hex(0b101)
'0x5'
>>> bin(5)
'0b101'
```

As you can see, Python understands some common bases out of the box, using the `0x` prefix for hexadecimal numbers, `0b` for binary ones, and `0o` for octals. However, the number is always printed in its base-10 form (`5` in this case). This also means that a binary number cannot be indexed, since integers do not provide support for this operation

```python
>>> 0b101[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not subscriptable
```
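Since plain integers cannot be indexed, a single bit has to be extracted with a shift and a mask. A minimal sketch (the helper name `get_bit` is hypothetical and not part of the package):

```python
def get_bit(number, position):
    """Return the bit of `number` at `position`, counting from the right."""
    # Shift the wanted bit into the lowest position, then mask out the rest
    return (number >> position) & 1


print(get_bit(0b101, 0))  # 1
print(get_bit(0b101, 1))  # 0
```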

You can also use a different base when converting things to integers, through the `base` parameter

```python
>>> a = int('101', base=2)
>>> a
5
>>> a = int('10', base=5)
>>> a
5
```
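Going the other way, from an integer back to a string of digits in a given base, can be done with the built-in `format()`. A short sketch:

```python
number = int("101", base=2)

# format() returns the digits without any prefix
binary_digits = format(number, "b")
hex_digits = format(255, "x")

# The "#" flag adds the same prefix that bin() and hex() produce
prefixed = format(number, "#b")

print(binary_digits)  # 101
print(hex_digits)     # ff
print(prefixed)       # 0b101
```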

## Test-driven development

Simple tasks are the best way to try and use new development methodologies, so this is a good occasion to start working with the so-called test-driven approach. Test-driven Development (TDD) basically means that the first thing you do when developing is to write some tests, that is programs that use what you are going to develop. The purpose of those programs is to test that your final product complies with a given behaviour. So they provide

• Documentation for your API: they are examples of use of your package.
• Regression checks: when you change the code to develop new features they shall not break the behaviour of the previous package versions.
• TODO list: until all tests run successfully you have something still waiting to be implemented.

I suggest you follow this post until we have some tests (section "Writing some tests" included), then write your own class, trying to make it pass all the tests. This way, by actually developing something, you can really learn both TDD and Python. Then you can check your code against mine and perhaps come up with a far better solution than mine.

Keep in mind, however, that the usual TDD workflow consists of writing one test and then adjusting the code to pass it (and all the previous tests). You shouldn't add a bunch of tests in one go, which I will do here just for brevity's sake.

## Development environment

Create and activate a Python virtual environment using your favourite tool. You can follow the official documentation or use something more structured like pyenv.

Once the virtual environment is activated, install Pytest with

```
$ pip install pytest
```

Then create a directory for the project and enter it. You are free to give it the name you prefer, and I will assume all the following commands are executed inside that directory.

First of all let's create two directories for code and tests

```
$ mkdir src
$ touch src/__init__.py
$ mkdir tests
$ touch tests/__init__.py
```

Finally, let us check that everything is working correctly. Since there are no tests yet, Pytest should terminate without errors.

```
$ pytest
===================== test session starts =====================
[...]
collected 0 items

================== no tests ran in 0.00 seconds ===============
```

where `[...]` will be filled with information about your execution environment.

## Pytest

The approach used by Pytest is very straightforward: to test your library you just have to write some functions that use it. Those functions shall run without raising any exception; if a test (a function) runs without raising an exception it passes, otherwise it fails. Let us start writing a very simple test to learn the basic syntax. Create the file `tests/test_binary.py` and write in it the following code

`tests/test_binary.py`
```python
def test_first():
    pass
```

If you run `pytest` again you shall obtain this result

```
$ pytest
===================== test session starts =====================
[...]
collected 1 items

tests/test_binary.py .

=================== 1 passed in 0.01 seconds ==================
```

If you prefer (as I do), you may use the `-v` verbose switch to get detailed information about which tests have been executed

```
$ pytest -v
===================== test session starts =====================
[...]
collected 1 items

tests/test_binary.py::test_first PASSED

=================== 1 passed in 0.01 seconds ==================
```

By default Pytest looks for Python files whose name starts with `test_`, and this is why it processes our file `tests/test_binary.py`. For each file it runs all functions whose name, again, starts with `test_`, and this is why `test_first()` has been executed.

The latter does nothing, so it runs without raising any exception and the test passes. Let us try to raise an exception

`tests/test_binary.py`
```python
def test_first():
    raise ValueError
```

which gives the following output

```
$ pytest -v
===================== test session starts =====================
[...]
collected 1 items

tests/test_binary.py::test_first FAILED

=========================== FAILURES ==========================
__________________________ test_first _________________________

def test_first():
>       raise ValueError
E       ValueError

tests/test_binary.py:2: ValueError
=================== 1 failed in 0.01 seconds ==================
```

To easily write tests that raise exceptions when failing we may use the `assert` Python statement, which shall be followed by an expression. If the expression returns a true value, `assert` does nothing, otherwise it raises an `AssertionError` exception. Let us do a quick check in the Python console

```python
>>> assert True
>>> assert False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
>>>
>>> assert 1 == 1
>>> assert 1 == 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
```

So usually our tests will contain some code and one or more assertions. I prefer to have just one assertion for each test, except perhaps when testing the very same feature more than once (for example getting various elements from a list). This way you are immediately aware of what assertion raised the exception, that is you immediately know what feature does not work as expected.
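When a test does need more than one assertion, an optional message after the expression can help to tell them apart. A small sketch in plain Python (the function name and the messages are made up for illustration):

```python
def check_first_and_last(values):
    # The string after the comma becomes the message of the AssertionError
    assert values[0] == 10, "first element is wrong"
    assert values[-1] == 30, "last element is wrong"


check_first_and_last([10, 20, 30])  # passes silently

try:
    check_first_and_last([10, 20, 99])
except AssertionError as exc:
    print(exc)  # last element is wrong
```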

## Writing some tests

So now we will pretend we have already developed our `Binary` class and write some tests that check its behaviour. I will add tests to check several aspects of our code and describe what they do. You can find the full project at https://github.com/TheDigitalCatOnline/python-oop-tdd-example, and I will mention specific commits during the development.

### Initialisation

`tests/test_binary.py`
```python
from src.binary import Binary


def test_binary_init_int():
    binary = Binary(6)
    assert int(binary) == 6
```

This is our first real test. First of all we import the class from the file `binary.py` (which doesn't exist yet). The function `test_binary_init_int()` should initialise a `Binary` with an integer. The assertion checks that the newly created variable `binary` has a consistent integer representation, which is the number we used to initialise it.

We want to be able to initialise a `Binary` with a wide range of values: bit strings (`'110'`), binary strings (`'0b110'`), hexadecimal strings (`'0x6'`), hexadecimal values (`0x6`), lists of integers (`[1,1,0]`) and lists of strings (`['1','1','0']`). The following tests check all those cases

`tests/test_binary.py`
```python
def test_binary_init_bitstr():
    binary = Binary("110")
    assert int(binary) == 6


def test_binary_init_binstr():
    binary = Binary("0b110")
    assert int(binary) == 6


def test_binary_init_hexstr():
    binary = Binary("0x6")
    assert int(binary) == 6


def test_binary_init_hex():
    binary = Binary(0x6)
    assert int(binary) == 6


def test_binary_init_intseq():
    binary = Binary([1, 1, 0])
    assert int(binary) == 6


def test_binary_init_strseq():
    binary = Binary(["1", "1", "0"])
    assert int(binary) == 6
```

Finally, let us check that our `Binary` class cannot be initialised with a negative number. I decided to represent negative numbers through the two's complement technique, which however requires a predefined bit length. So for simple binaries I just discard negative numbers.

`tests/test_binary.py`
```python
import pytest

[...]

def test_binary_init_negative():
    with pytest.raises(ValueError):
        Binary(-4)
```

As you can see, now we have to check that our class raises an exception, but if we make the class raise it the test will fail. To let the test pass we shall check that the exception is raised but suppress it, and this can be done with `pytest.raises`, which is a suitable context manager.
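To get a feeling for what `pytest.raises` does, here is a rough sketch of a similar context manager written with the standard library only. The name `expect_exception` is hypothetical, and the real Pytest implementation is more sophisticated than this:

```python
from contextlib import contextmanager


@contextmanager
def expect_exception(exception_type):
    try:
        yield
    except exception_type:
        # The expected exception was raised: suppress it, the check passes
        return
    # The block finished without raising: this counts as a failure
    raise AssertionError(f"{exception_type.__name__} was not raised")


# int() raises ValueError here, so the block completes silently
with expect_exception(ValueError):
    int("not a number")
```

If the code inside the `with` block does not raise the expected exception, the sketch raises an `AssertionError`, which is what makes a test fail.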

### Conversions

I want to check that my binary numbers can be correctly converted to integers (through `int()`), binary strings (through `bin()`), hexadecimals (through `hex()`) and to strings (through `str()`). I want the string representation to be a plain sequence of zeros and ones, that is the binary string representation without the `0b` prefix.

Some examples (check the full code for the whole set of tests)

`tests/test_binary.py`
```python
def test_binary_int():
    binary = Binary(6)
    assert int(binary) == 6


def test_binary_str():
    binary = Binary(6)
    assert str(binary) == '110'
```

### Writing the class

Trying to run the tests at this point just returns a big failure due to an import error, since the `binary.py` module does not exist yet.

```
$ pytest -v
===================== test session starts =====================
[...]
collected 0 items / 1 errors
============================ ERRORS ===========================
____________ ERROR collecting tests/test_binary.py ____________
tests/test_binary.py:3: in <module>
from src.binary import Binary
E   ModuleNotFoundError: No module named 'src.binary'
=================== 1 error in 0.01 seconds ===================
```

Let us create the file `binary.py` and start creating the `Binary` class.

`src/binary.py`
```python
class Binary:
    pass
```

Now when you run Pytest the output shows that all tests are found and that all of them fail (I will just show the first one)

```
$ pytest -v
===================== test session starts =====================

[...]

collected 13 items

tests/test_binary.py::test_binary_init_int FAILED       [  7%]
tests/test_binary.py::test_binary_init_negative FAILED  [ 15%]
tests/test_binary.py::test_binary_init_bitstr FAILED    [ 23%]
tests/test_binary.py::test_binary_init_binstr FAILED    [ 30%]
tests/test_binary.py::test_binary_init_hexstr FAILED    [ 38%]
tests/test_binary.py::test_binary_init_hex FAILED       [ 46%]
tests/test_binary.py::test_binary_init_intseq FAILED    [ 53%]
tests/test_binary.py::test_binary_init_strseq FAILED    [ 61%]
tests/test_binary.py::test_binary_eq FAILED             [ 69%]
tests/test_binary.py::test_binary_int FAILED            [ 76%]
tests/test_binary.py::test_binary_bin FAILED            [ 84%]
tests/test_binary.py::test_binary_str FAILED            [ 92%]
tests/test_binary.py::test_binary_hex FAILED            [100%]

============================ FAILURES ==========================
_____________________ test_binary_init_int _____________________

def test_binary_init_int():
>       binary = Binary(6)
E       TypeError: Binary() takes no arguments

tests/test_binary.py:7: TypeError

[...]

=================== 13 failed in 0.03 seconds ==================
```

So now you can start writing code and use your test battery to check if it works as expected. Obviously "as expected" means that all the tests you have pass, but this does not imply you covered all cases. TDD is an iterative methodology: when you find a bug or a missing feature you first write a good test or a set of tests that address the matter and then produce some code that makes the tests pass.

At this point you are warmly encouraged to write the code by yourself and to check your product with the given battery of tests. Create the file `tests/test_binary.py` and copy into it one test at a time. Read the test carefully to understand what you need to implement and start writing the class. When you think you are done with a part of it just run the tests and see if everything works well, then move on to the next one.


## My solution

The most complex part of the class is the initialization, since I want it to accept a wide range of data types. Basically we have to deal with sequences (strings and lists) or with plain values. The latter ones shall be convertible to an integer, otherwise trying to initialise a binary number with them makes no sense. Since binary numbers are just a representation of integers I decided to store the value in the class as an integer inside the `self._value` attribute. Pay attention that this decision means that all leading zeros will be stripped from the number, i.e., `Binary('000101')` is equal to `Binary('101')`. This will be important for indexing and slicing.

This is the code

`src/binary.py`
```python
from collections.abc import Sequence


class Binary:
    def __init__(self, value=0):
        if isinstance(value, Sequence):
            # Strings with a prefix: "0b..." (binary) or "0x..." (hexadecimal)
            if len(value) > 2 and value[0:2] == "0b":
                self._value = int(value, base=2)
            elif len(value) > 2 and value[0:2] == "0x":
                self._value = int(value, base=16)
            else:
                # Generic sequence of digits: bit strings, lists of ints or strings
                self._value = int("".join([str(i) for i in value]), base=2)
        else:
            try:
                self._value = int(value)
                if self._value < 0:
                    # Caught by the except clause below and re-raised
                    # with the generic conversion message
                    raise ValueError("Binary cannot accept negative numbers")
            except ValueError:
                raise ValueError(f"Cannot convert value {value} to Binary")

    def __int__(self):
        return self._value
```

Running the tests I get `4 failed, 9 passed in 0.02 seconds`, which is a good score. The tests that still fail are `test_binary_eq`, `test_binary_bin`, `test_binary_str` and `test_binary_hex`. Since I haven't written any code for the conversions yet, those failures were expected.

Let us review the code I wrote. I make use of the module `collections.abc` to check if the input is a sequence or a plain value. If you do not know what Abstract Base Classes are, please check this post and the documentation of the module `collections.abc`.

Basically through `isinstance(value, Sequence)` we check that the incoming value behaves like a sequence, which is different from saying that it is a `list`, a `string`, or other sequences. The first case covers an incoming string in the form `0bXXXXX`, which is converted to an integer through the `int()` function. The second case is the same but for hexadecimal strings in the form `0xXXXXX`.

The third case covers a generic sequence of values that shall be individually convertible to `0` or `1`. The code converts each element of the sequence to a string, joins them in a single string and converts it with base 2. This covers the case of a string of zeros and ones and the case of an iterable of integers, like a list for example.

If the incoming value is not a sequence it shall be convertible to an integer, which is exactly what the `try` part does. Here we also check if the value is negative and raise a suitable exception.
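The sequence check and the digit-joining step described above can be tried in isolation, without the class. This is just an illustrative sketch using the standard library:

```python
from collections.abc import Sequence

# Strings, lists and tuples all register as sequences
print(isinstance("110", Sequence))      # True
print(isinstance([1, 1, 0], Sequence))  # True
print(isinstance((1, 1, 0), Sequence))  # True

# Plain integers are not sequences, and neither are sets (they have no order)
print(isinstance(6, Sequence))          # False
print(isinstance({1, 0}, Sequence))     # False

# The conversion applied to a generic sequence of digits
digits = ["1", 1, "0"]
converted = int("".join(str(i) for i in digits), base=2)
print(converted)  # 6
```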

Finally, the `__int__()` method (one of the Python magic methods) is automatically called when we apply `int()` to our binary, just like we do in a lot of the tests. This method is basically the one responsible for providing a conversion to integer of a given class. In this case we just have to return the value we stored internally.

Please note that I didn't write this code in a single burst. I had to run the tests more than once to tune my code.

I already wrote the method that performs the conversion to an integer. Some tests however (namely `test_binary_bin` and `test_binary_hex`) still fail with the error message `TypeError: 'Binary' object cannot be interpreted as an integer`.

According to the official documentation, "If x is not a Python int object, it has to define an `__index__()` method that returns an integer." so this is what we are missing. As for `__index__()`, the documentation states that "In order to have a coherent integer type class, when `__index__()` is defined `__int__()` should also be defined, and both should return the same value."

So we just have to add

`src/binary.py`
```python
    def __index__(self):
        return self.__int__()
```

inside the class, and we get two more successful tests.
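The effect of `__index__()` can be seen in isolation with a tiny class that is unrelated to `Binary` (the name `Wrapper` is made up for this sketch):

```python
class Wrapper:
    def __init__(self, value):
        self._value = value

    def __int__(self):
        return self._value

    def __index__(self):
        # bin(), hex() and sequence indexing all rely on this method
        return self.__int__()


w = Wrapper(6)
print(bin(w))                       # 0b110
print(hex(w))                       # 0x6
print(["a", "b", "c"][Wrapper(1)])  # b
```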

To make `test_binary_str` pass we have to provide a magic method that converts the object into a string, which is

`src/binary.py`
```python
    def __str__(self):
        return bin(self)[2:]
```

It makes use of the internal Python algorithm provided by `bin()` stripping the `0b` prefix.

The last failing test is `test_binary_eq` which tests for equality between two `Binary` objects. Out of the box, Python compares objects on a very low level, just checking if the two references point to the same object in memory. To make it smarter we have to provide the `__eq__()` method

`src/binary.py`
```python
    def __eq__(self, other):
        return int(self) == int(other)
```

And now all the tests run successfully.
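The difference between the default comparison and a custom `__eq__()` can be sketched with two throwaway classes (both names are hypothetical):

```python
class WithoutEq:
    def __init__(self, value):
        self._value = value


class WithEq:
    def __init__(self, value):
        self._value = value

    def __int__(self):
        return self._value

    def __eq__(self, other):
        return int(self) == int(other)


# Default behaviour: two distinct objects are never equal
print(WithoutEq(5) == WithoutEq(5))  # False

# Custom behaviour: the comparison is based on the integer value
print(WithEq(5) == WithEq(5))        # True
print(WithEq(5) == 5)                # True
```

Keep in mind that a class that defines `__eq__()` without also defining `__hash__()` becomes unhashable, so its instances cannot be used as dictionary keys or set members.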


## Binary operations

Now it is time to add new features to our `Binary` class. As already said, the TDD methodology wants us to first write the tests, then to write the code. Our class is missing some basic arithmetic and binary operations. Some of the tests we can add are

`tests/test_binary.py`
```python
def test_binary_addition_int():
    assert Binary(4) + 1 == Binary(5)


def test_binary_addition_binary():
    assert Binary(4) + Binary(5) == Binary(9)
```

These check that adding both an integer and a `Binary` to a `Binary` works as expected.

`tests/test_binary.py`
```python
def test_binary_division_int():
    assert Binary(20) / 4 == Binary(5)


def test_binary_division_rem_int():
    assert Binary(21) / 4 == Binary(5)
```

These check two different cases because division can produce a remainder, which is not considered here.
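The behaviour requested by these two tests is that of integer division. For non-negative integers it can be sketched with Python's built-in operators:

```python
# Floor division discards the remainder
exact = 20 // 4
truncated = 21 // 4
print(exact, truncated)  # 5 5

# divmod() returns quotient and remainder together
quotient, remainder = divmod(21, 4)
print(quotient, remainder)  # 5 1
```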

`tests/test_binary.py`
```python
def test_binary_get_bit():
    binary = Binary('0101110001')
    assert binary[0] == '1'
    assert binary[5] == '1'


def test_binary_not():
    assert ~Binary('1101') == Binary('10')


def test_binary_and():
    assert Binary('1101') & Binary('1') == Binary('1')


def test_binary_shl_pos():
    assert Binary('1101') << 5 == Binary('110100000')
```

The function `test_binary_get_bit` tests indexing and is one of the few tests that contain more than one assertion. Please note that binary indexing, unlike standard sequence indexing in Python, starts from the rightmost element.

Bitwise and arithmetic operations are implemented using Python magic methods. Please check the official documentation for a complete list of operators and related methods.

The code that implements the required behaviour is

`src/binary.py`
```python
    def __and__(self, other):
        return Binary(self._value & Binary(other)._value)

    def __or__(self, other):
        return Binary(self._value | Binary(other)._value)

    def __xor__(self, other):
        return Binary(self._value ^ Binary(other)._value)

    def __lshift__(self, pos):
        return Binary(self._value << pos)

    def __rshift__(self, pos):
        return Binary(self._value >> pos)

    def __add__(self, other):
        return Binary(self._value + Binary(other)._value)

    def __sub__(self, other):
        return Binary(self._value - Binary(other)._value)

    def __mul__(self, other):
        return Binary(self._value * Binary(other)._value)

    def __truediv__(self, other):
        return Binary(int(self._value / Binary(other)._value))

    def __invert__(self):
        return Binary([abs(int(i) - 1) for i in str(self)])
```

The method `__invert__()` is called when performing a bitwise NOT operation (`~`) and is implemented avoiding negative numbers. A simple solution is to convert the `Binary` into a string and then invert every digit (`abs(int(i) - 1)` returns 0 for '1' and 1 for '0').
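For comparison, applying `~` to a plain Python integer yields a negative number, which is why the method works on the digits instead. An alternative sketch, not the one used in this post, flips the bits with a XOR against a mask of ones as long as the number (the function name `invert_bits` is hypothetical):

```python
def invert_bits(value):
    """Flip every bit of a non-negative integer, ignoring leading zeros."""
    # Build a mask with as many ones as the number has significant bits.
    # For value 0 bit_length() is 0, so we force a single bit.
    length = max(value.bit_length(), 1)
    mask = (1 << length) - 1
    return value ^ mask


print(~0b1101)                   # -14
print(bin(invert_bits(0b1101)))  # 0b10
```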


## Indexing

I want `Binary` to support indexing, just like lists. The difference between lists and the `Binary` type is that for the latter indexes start from the rightmost element.

We can test this in a simple way with

`tests/test_binary.py`
```python
def test_binary_get_bit():
    binary = Binary("0101110001")
    assert binary[0] == "1"
    assert binary[5] == "1"
```

The implementation of this behaviour is actually very simple, but as we will see later it contains some bugs.

`src/binary.py`
```python
    def __getitem__(self, item):
        return str(self)[-(item + 1)]
```
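The index arithmetic can be tried on a plain string to see how positions counted from the right map onto Python's negative indexes:

```python
digits = "101110001"  # what str(Binary("0101110001")) returns

# Position 0 is the rightmost character, i.e. index -1
print(digits[-(0 + 1)])  # 1
# Position 1 is the second character from the right, i.e. index -2
print(digits[-(1 + 1)])  # 0
# Position 8 is the leftmost character, i.e. index -9
print(digits[-(8 + 1)])  # 1
```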


## Slicing

Just like lists, `Binary` should support slicing. Always keep in mind the reversed indexing, that can be exemplified by

```python
>>> b = Binary('01101010')
>>> b[4:7]
<binary.Binary object at 0x...> (110)
>>> b[1:4]
<binary.Binary object at 0x...> (101)
```

This can be directly converted into a test

`tests/test_binary.py`
```python
def test_binary_slice():
    assert Binary('01101010')[0:3] == Binary('10')
    assert Binary('01101010')[1:4] == Binary('101')
    assert Binary('01101010')[4:] == Binary('110')
```

This shows that my initial implementation of the method `__getitem__()` was too trivial. The output of this test shows that the method doesn't support slicing, raising the exception `TypeError: unsupported operand type(s) for +: 'slice' and 'int'`. Checking the documentation of `__getitem__()` we notice that it shall manage both integers and `slice` objects (this is the missing part) and that it shall raise `IndexError` for illegal indexes to make `for` loops work. So I immediately add a test for this rule and other tests to match the correct behaviour

`tests/test_binary.py`
```python
def test_binary_negative_index():
    assert Binary("0101110001")[-1] == "1"
    assert Binary("0101110001")[-2] == "0"


def test_binary_illegal_index():
    with pytest.raises(IndexError):
        Binary("01101010")[7]


def test_binary_inappropriate_type_index():
    with pytest.raises(TypeError):
        Binary("01101010")["key"]


def test_binary_for_loop():
    assert [int(i) for i in Binary("01101010")] == [0, 1, 0, 1, 0, 1, 1]
```

Remember that leading zeros are stripped, so getting the index number 7 shall fail, since the binary number has just 7 digits, even if the incoming string has 8 characters.
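What `__getitem__()` actually receives can be inspected with a small probe class (the name `Probe` is hypothetical). The second class sketches how raising `IndexError` lets a `for` loop terminate on an object that defines only `__getitem__()`:

```python
class Probe:
    def __getitem__(self, key):
        # Return the key itself to show what Python passes in
        return key


probe = Probe()
print(probe[3])    # 3
print(probe[1:4])  # slice(1, 4, None)


class ThreeItems:
    def __getitem__(self, key):
        if key > 2:
            # Iteration stops when IndexError is raised
            raise IndexError("index out of range")
        return key * 10


print(list(ThreeItems()))  # [0, 10, 20]
```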

I also added a test for empty slices. For those, I arbitrarily decided that an empty slice should return `Binary(0)`.

`tests/test_binary.py`
```python
def test_empty_slice():
    assert Binary("01101010")[4:4] == Binary("0")
```

Instead of trying to reimplement the whole list slicing behaviour with reversed indexes, it is much simpler to make use of it. We can just take the string version of our `Binary`, slice its reversed version and return the result (reversed again). The result of the slice can be a single element, however, so we have to check it against the `Sequence` class

`src/binary.py`
```python
    def __getitem__(self, key):
        reversed_list = [int(i) for i in reversed(str(self))]
        sliced = reversed_list.__getitem__(key)
        if isinstance(sliced, Sequence):
            if len(sliced) > 0:
                return Binary([i for i in reversed(sliced)])
            else:
                return Binary(0)
        else:
            return Binary(sliced)
```

The first list comprehension returns a list of integers with the reversed version of the binary number (i.e. from `Binary('01101')` to `[1, 0, 1, 1]`, remember that leading zeros are stripped). Then we delegate the slice to the `list` type, calling `__getitem__()`. The result of this call may be a sequence or a single element (integer), so we tell apart the two cases. In the first case we reverse the result again, in the second case we just return it. In both cases we create a `Binary` object. The check on the length of the list must be introduced because the slice may return no elements, but the `Binary` class does not accept an empty list.
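The same steps can be followed by hand on plain lists, without the class. A short sketch:

```python
digits = "1101010"  # what str(Binary("01101010")) returns

# Reverse the digits so that index 0 is the rightmost bit
reversed_list = [int(i) for i in reversed(digits)]
print(reversed_list)  # [0, 1, 0, 1, 0, 1, 1]

# Delegate the slice to the list type
sliced = reversed_list[4:7]
print(sliced)  # [0, 1, 1]

# Reverse again to restore the usual reading order
result = list(reversed(sliced))
print(result)  # [1, 1, 0]

# An empty slice yields an empty list, which needs special handling
empty = reversed_list[4:4]
print(empty)  # []
```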


## Splitting binaries

The last feature I want to add is a function `split()` that divides the binary number into two binaries. The rightmost one shall have the given size in bits, while the leftmost just contains the remaining bits. The following tests exemplify the behaviour of `split()`

`tests/test_binary.py`
```python
def test_binary_split_no_remainder():
    assert Binary('110').split(4) == (0, Binary('110'))


def test_binary_split_remainder():
    assert Binary('110').split(2) == (1, Binary('10'))


def test_binary_split_exact():
    assert Binary('100010110').split(9) == (0, Binary('100010110'))


def test_binary_split_leading_zeros():
    assert Binary('100010110').split(8) == (1, Binary('10110'))
```

The code that implements it leverages the slicing behaviour

`src/binary.py`
```python
    def split(self, bits):
        return (self[bits:], self[:bits])
```
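Since the value is stored as an integer, the same result could also be obtained arithmetically with a shift and a mask. This is an alternative sketch on plain integers, not the implementation used here, and the function name `split_int` is hypothetical:

```python
def split_int(value, bits):
    """Split a non-negative integer into (upper part, lowest `bits` bits)."""
    # A mask with `bits` ones selects the rightmost part
    mask = (1 << bits) - 1
    return (value >> bits, value & mask)


print(split_int(0b110, 4))        # (0, 6)
print(split_int(0b110, 2))        # (1, 2)
print(split_int(0b100010110, 8))  # (1, 22)
```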


## Final words

If you tried to write your own class before checking my solution, I'm sure you experienced both some frustration when tests failed and a great joy when they finally passed. I'm also sure that you could appreciate the simplicity of TDD and perhaps understand why so many programmers adopt it.

In the next post I will guide you through the addition of the `SizeBinary` class, again following the TDD methodology.

2015-05-15 As suggested by Jacob Zimmerman, the class lacks some methods to be a complete numeric class, most notably `__radd__` and `__rsub__`. Indeed, my first goal was to show TDD, so I did not add the whole series of reflected arithmetic operations. You can find all those methods here and try to implement them following the methodology shown in the post. Jacob also suggested shortening the `__str__()` implementation, and I fixed it. Thanks Jacob!
2016-12-20 GitHub user dndln found an error in the 'Binary operations' section. The `test_binary_get_bit()` test included the assertion `assert binary == '0'` which cannot be successful, since leading zeros are stripped, as stated in the previous section 'My Solution'. The attached code files were already correct. Thanks a lot for pointing it out!