Array: Python Implementation

ishant_mrinal · July 16, 2022, 2:31pm

Array is a data structure to store data at contiguous memory locations.

Element in an array is accessed in constant time provided we know the first element memory address and the data-type of the data stored in the array. Only data of a single data-type can be stored in an array. An array can either hold integer or float or string etc; but not a combination of float, integer or integer, string etc.

Array is light on memory requirement, as we only need to store the values, not any identifier or index for the value.

Traversing an array is simple and is O(n) complexity operation.

Array complexity

Time complexity

Worst case

Access	Search	Insertion	Deletion
O(1)	O(n)	O(n)	O(n)

Space Complexity

O(n)

Example array and memory addresses

Following is one graphical demonstration of an array which has 7 elements. Each element of the array is an integer and takes 32 bit of memory to store the value.

The next element of the array can be accessed based on the starting element memory address as follows:

memory_address_of_next_element = memory_address_of_start_element 
 + single_element_memory_size * index

In the above array single_element_memory_size = 32, memory_address_of_start_element=100

Element at index i can be accessed as 100 + 32 * i.
Address of element at index 3 is 100 + 32 * 3=196

Resize

Arrays are of constant size, language like C/C++ does not allow to resize the array dynamically. To increase the size of an array, the user has to create a new array with the desired size and copy the elements from previous array, which is O(n) complexity operation.

Reindexing

When deleting an element from the array, reindexing is required for the elements succeeding the removed element. Complexity of reindexing is again O(n)

Array Implementation in Python

Here we will use Python to implement a simple Array data structure. We will use Python’s ctypes to allocate a block of contiguous memory for our array.

A contiguous block of memory can be allocated as follows:

import ctypes

# allocate memory for 10 unsigned integers
array = (ctypes.c_unit * 10) ()

# allocate memory for 10 floating point numbers
array = (ctypes.c_float * 10) ()

# allocate memory for 10 python objects, exact amount of memory will depend on the object type stored
# if integer is stored then 32 bit per object, if float is stored 96 bit per object etc
array = (ctypes.py_object * 10) ()

Lets verify if the above construct really give consecutive memory addresses:

>>> import ctypes
>>> array = (ctypes.c_uint * 10)()
>>> for i in range(10):
...     array[i] = i+1
... 
>>> array[0]
1
>>> array[9]
10
>>> id(array[0])
4320311488
>>> id(array[1])
4320311520
# Get hexadecimal representation of the memory address
>>> hex(id(array[0]))
'0x10182b8c0'
# The following address is 32 bit far from the address of array[0]
# 0x10182b8e0 - 0x10182b8c0 = 0x20 = 2 * 16 + 0 * 1 = 32
>>> hex(id(array[1]))
'0x10182b8e0'
# The following address is 32 bit far from the address of array[1]
>>> hex(id(array[2]))
'0x10182b900'
# The following address is 32 bit far from the address of array[2]
>>> hex(id(array[3]))
'0x10182b920'

In the above example snippet we use id function to get memory address and hex to convert the address to hexadecimal format for readability.

Here in our implementation we will use ctypes.py_object to allocate memory such that user don’t have to explicitly set the data type while creating the array.

Here we will implement Array with the following features:

Append
Append an element to the array.
Remove
Remove an element at an index of the array
Pop
Remove and returns the last element of the array
Set item __setitem__
Set an element using explicit index
Get item __getitem__
Get an element using explicit index

import ctypes
from typing import Any

class Array:
  """Array data structure implementation in Python.
  
  Args:
    capacity: Maximum capacity of the array.
  """
  def __init__(self, capacity: int):
    """Initialize instance level attributes."""
    self.capacity = capacity
    # allocate memory
    self._array = self._allocate_memory(capacity)
    # Point to next available index to store data
    self.next_id = 0
    # Fill the array with None values
    self._prefill_array()

  def _prefill_array(self) -> None:
    """Prefill the array with None values."""
    for idx in range(self.next_id, self.capacity):
      self._array[idx] = None

  def _allocate_memory(self, capacity: int) -> Any:
    """Allocate memory to hold n objects.
    
    Args:
     capacity: Capacity of the array.

    Returns:
      Initialized array.
    """
    return (ctypes.py_object * capacity)()

  def __len__(self) -> int:
    """Returns the length of the array.

    Returns:
      Length of the array.
    """
    return self.next_id

  def __getitem__(self, index: int) -> Any:
    """Get an array element using index.

    Args:
      index: Index of the element.

    Returns:
      Element at that index.

    Raises:
      IndexError: If the index is out of range.
    """
    if not 0 <= index < self.next_id:
      return IndexError('Index out of range.')
    return self._array[index]

  def __setitem__(self, index: int, value: Any) -> None:
    """Set an array element using index.

    Args:
      index: Index of the element.
      value: Value of the element.

    Raises:
      IndexError: If the index is out of range.
    """
    if index > self.capacity - 1:
      raise IndexError("Index out of range")
    self._array[index] = value
    if self.next_id < index:
      self.next_id = index + 1

  def append(self, value: Any) -> None:
    """Append an element to the array.

    Args:
      value: Value of the element to append.
    """
    if self.next_id == self.capacity:
      self._increase_capacity(2 * self.capacity)

    self._array[self.next_id] = value
    self.next_id += 1

  def _increase_capacity(self, new_capacity: int) -> None:
    """Increase array capacity.

    Args:
      new_capacity: New capacity of the array.
    """
    _array = self._allocate_memory(new_capacity)
    # copy over the elements from the current array to the new one
    for i in range(self.next_id):
      _array[i] = self._array[i]

    self._array = _array
    # update capacity
    self.capacity = new_capacity
    self._prefill_array()

  def remove(self, index: int) -> None:
    """Remove an element at an index.

    Args:
      index: Index of the element to remove.

    Raises:
      IndexError: If index is out of range.
    """
    if 0 > index or index > self.next_id - 1:
      return IndexError('Index out of range.')

    for idx in range(index, self.next_id - 1):
      self._array[idx] = self._array[idx + 1]
    self.next_id += -1

  def pop(self) -> Any:
    """Remove and return the last element.

    Returns:
      Last element of the array.
    """
    if not self.next_id:
      raise ValueError("Empty array.")
    val = self._array[self.next_id - 1]
    self._array[self.next_id - 1] = None
    self.next_id -= 1
    return val

  def __repr__(self) -> str:
    """Representation of the Array object.

    Returns:
      A str representing the array.
    """
    return str([self._array[val] for val in range(self.next_id)])

Using the array

# Initialize an array to store 20 elements
>>> my_array = Array(capacity=20)
# store element 5 at index 0 (use append)
>>> my_array.append(5)
# Store using index, store 10 at index 1
>>> my_array[1] = 10
# Remove element at index 0
>>> my_array.remove(1)
# pop the last element
>>> my_array.pop()
5

Python inbuilt Array DS

Python also provide an inbuilt array implementation, which can be accessed as follows

import array

# initialize an array of integer with values [1, 10, 4, 3]
# here "i" mean signed integer data type.
>>> my_array = array.array("i", [1, 10, 4, 3])
>>> my_array[1]
10
>>> my_array[2]
4